1. Introduction

Suppose that $\zeta = (\zeta_1, \ldots, \zeta_d)^\top \in \mathbb{R}^d$ is a random vector with independent components satisfying $E(\zeta_j) = 0$, $j = 1, \ldots, d$. Additionally, let $Q$ be a $d \times d$ positive semidefinite matrix with real (non-random) entries. Quadratic forms $\zeta^\top Q \zeta$ have been studied for decades in statistics (Chatterjee, 2008; Götze and Tikhomirov, 1999, 2002; Hall, 1984; Hanson and Wright, 1971; Sevastyanov, 1961; Whittle, 1964). This paper is largely motivated by recent applications involving random-effects models, which rely heavily on properties of quadratic forms (e.g. de los Campos et al., 2015; Dicker and Erdogdu, 2015; Jiang et al., 2014). In the first part of the paper, we give two new finite sample bounds for quadratic forms, a uniform concentration inequality (Theorem 1) and a normal approximation result (Theorem 2), which may be useful in a variety of statistical applications. The second part of the paper focuses on applications of Theorems 1-2 related to variance components estimation in linear random-effects models, including non-standard models with correlated random effects (cf. Proposition 2).

Theorems 1 and 2 are the main theoretical results in the paper. We rate the novelty of our normal approximation result, Theorem 2, which is a multivariate normal approximation result proved via Stein’s method for exchangeable pairs, higher than that of Theorem 1. However, the main emphasis of both results is convenience for use in applications.

Our concentration bound, Theorem 1, is a uniform version of the Hanson-Wright inequality for quadratic forms. The method of proof for Theorem 1 is relatively standard: it combines a chaining argument from empirical process theory (e.g. Chapter 3 of Van de Geer, 2000) with the pointwise bound of the original Hanson-Wright inequality. It should be possible to generalize our result to larger classes of quadratic forms, similar to (Adamczak, 2014). However, we note that while Theorem 1 is restricted to relatively simple (Lipschitz) classes of quadratic forms, it is not a corollary of the uniform bounds in (Adamczak, 2014), which require a stronger condition on the distribution of $\zeta$ (see the comments in Section 3.1 following the statement of Theorem 1).

Theorem 2 is a normal approximation result for vectors of quadratic forms. Most of the existing normal approximation results for quadratic forms are asymptotic results (Hall, 1984; Jiang, 1996; Whittle, 1964), require the random variables $\zeta_j$ to be iid (Chatterjee, 2008; Götze and Tikhomirov, 1999, 2002; Hall, 1984), or have other limitations (Sevastyanov, 1961). Theorem 2 gives a non-asymptotic normal approximation bound, which applies to $\zeta$ with independent (but not necessarily identically distributed) sub-Gaussian components. Furthermore, in contrast with most existing results on quadratic forms, which are predominantly univariate, Theorem 2 is a multivariate result, which applies to vectors of quadratic forms $(\zeta^\top Q_1 \zeta, \ldots, \zeta^\top Q_K \zeta)^\top$ for positive semidefinite matrices $Q_1, \ldots, Q_K$ (the applications to random-effects models considered in Section 4 require $K = 2$). The proof of Theorem 2 relies on Stein’s method of exchangeable pairs and follows the embedding approach of Reinert and Röllin (2009). Theorem 2 and its proof share similarities with Proposition 3.1 of Chatterjee (2008). However, Proposition 3.1 of Chatterjee (2008) applies only to a single quadratic form in iid Rademacher random variables $\zeta_j$ satisfying $P(\zeta_j = 1) = P(\zeta_j = -1) = 1/2$.

Linear random-effects models are studied in Section 4. Asymptotic results for quadratic forms serve as the theoretical underpinning for many applications involving random-effects models (Hartley and Rao, 1967; Jiang, 1996, 1998). However, new applications of random-effects models in genomics have pushed the boundaries of existing theoretical results (de los Campos et al., 2015; Golan and Rosset, 2011; Jiang et al., 2014; Speed et al., 2012; Yang et al., 2010, 2014; Zaitlen and Kraft, 2012). In Section 4, we present new non-asymptotic bounds for variance components estimators in linear random-effects models. To our knowledge, these are the first finite sample results on the statistical properties of variance components estimators. Many now-classical asymptotic results for random-effects models (e.g. Jiang, 1996) follow as corollaries of our finite sample results in Section 4. More significantly, non-asymptotic bounds, like those in this paper, provide increased flexibility for use in applications. In particular, our results can be easily applied in non-standard settings, where it is less clear how to adapt the existing asymptotic theory; see, for example, Proposition 2, which applies to random-effects models with correlated random-effects, and (Dicker and Erdogdu, 2015) for an application involving fixed-effects models.

The rest of the paper proceeds as follows. Some basic notation is introduced in Section 2. The main results are stated in Section 3. Linear random-effects models are studied in Section 4. The proofs of Theorems 1-2 and Propositions 1 and 3 are contained in the Appendices; other results are proved in the Supplementary Material.


2. Notation 


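If $u = (u_1, \ldots, u_p)^\top \in \mathbb{R}^p$, then $\|u\| = (u_1^2 + \cdots + u_p^2)^{1/2}$ is its Euclidean norm. For a $d \times m$ matrix $A = (a_{ij})$ with real entries, let $\|A\| = \sup_{\|x\| = 1} \|Ax\|$ and $\|A\|_{HS} = (\sum_{i=1}^{d} \sum_{j=1}^{m} a_{ij}^2)^{1/2}$ be the operator norm and the Hilbert-Schmidt (Frobenius) norm of $A$, respectively. If $f : \mathbb{R}^m \to \mathbb{R}$ is a function with $k$-th order derivatives, define

$$|f|_j = \sup_{x \in \mathbb{R}^m} \max_{i_1, \ldots, i_j} \left| \frac{\partial^j}{\partial x_{i_1} \cdots \partial x_{i_j}} f(x) \right|, \quad j = 1, \ldots, k,$$

and let $|f|_0 = \sup_{x \in \mathbb{R}^m} |f(x)|$. Additionally, define $C_b^k(\mathbb{R}^m) = \{f : \mathbb{R}^m \to \mathbb{R};\ |f|_j < \infty,\ j = 0, 1, \ldots, k\}$ to be the class of real-valued functions on $\mathbb{R}^m$ with bounded derivatives up to order $k$. Finally, following (Vershynin, 2010), let $\|\zeta\|_{\psi_2} = \sup_{r \ge 1} r^{-1/2} E(|\zeta|^r)^{1/r}$ be the sub-Gaussian norm of the real-valued random variable $\zeta$.
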
3. Results for quadratic forms
3.1. Uniform concentration bound

The Hanson-Wright inequality is a classical probabilistic bound for quadratic forms, which has been the subject of renewed attention recently in applications related to random matrix theory (e.g. Adamczak, 2014; Hsu et al., 2012; Rudelson and Vershynin, 2013). Theorem 1 is a uniform version of the Hanson-Wright inequality, which applies to families of quadratic forms $\zeta^\top Q(u) \zeta$, where $Q(u)$ is a matrix function of $u \in [0, R]^K \subseteq \mathbb{R}^K$. As illustrated in Section 4, Theorem 1 has applications in the analysis of random-effects models; more broadly, it has applications in M-estimation and maximum likelihood problems with non-iid data.

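Theorem 1. Let $0 < R < \infty$ and let $t_1(u), \ldots, t_m(u)$ be real-valued Lipschitz functions on $[0, R]^K \subseteq \mathbb{R}^K$ satisfying

$$\max_{i = 1, \ldots, m} |t_i(u) - t_i(u')| \le L \|u - u'\|, \quad u, u' \in [0, R]^K, \qquad (1)$$

for some constant $0 < L < \infty$. Let $T(u) = \mathrm{diag}\{t_1(u), \ldots, t_m(u)\}$, let $V$ be a $d \times m$ matrix, and define $Q(u) = V T(u) V^\top$. Additionally, let $\zeta = (\zeta_1, \ldots, \zeta_d)^\top \in \mathbb{R}^d$, where $\zeta_1, \ldots, \zeta_d$ are independent mean 0 sub-Gaussian random variables satisfying

$$\max_{i = 1, \ldots, d} \|\zeta_i\|_{\psi_2} \le \gamma \qquad (2)$$

for some constant $\gamma \in (0, \infty)$. Then there exists an absolute constant $C \in (0, \infty)$ such that

$$P\left[ \sup_{u \in [0, R]^K} \left| \zeta^\top Q(u) \zeta - E\{\zeta^\top Q(u) \zeta\} \right| > r \right] \le C \exp\left[ -\frac{1}{C} \min\left\{ \frac{r^2}{\gamma^4 \|V^\top V\|^2 (\|T(0)\|_{HS}^2 + K L^2 R^2 m)},\ \frac{r}{\gamma^2 \|V^\top V\| (\|T(0)\| + K^{1/2} L R)} \right\} \right]$$

whenever $r^2 \ge C \gamma^4 \|V^\top V\|^2 K^3 L^2 R^2 m$.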


Theorem 1 is proved in Appendix A. In a typical application, the dimension $K$ will be small (in Section 4, we use Theorem 1 with $K = 1$) and $m$, $d$ may be large. A uniform Hanson-Wright inequality, with a similar upper bound, is also given in (Adamczak, 2014). However, Adamczak’s result applies to random vectors satisfying a relatively strong concentration property and does not cover sub-Gaussian random vectors satisfying only (2); see Remark 4 following Theorem 2.3 in (Adamczak, 2014).
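
As a rough numerical illustration of the kind of statement Theorem 1 makes (it plays no role in the proof), the following sketch simulates the supremum deviation for $K = 1$. Everything specific here is an assumption chosen for illustration: the Lipschitz functions $t_i(u) = 1/(1 + u a_i)$, the Rademacher entries of $\zeta$ (so that $E\{\zeta^\top Q(u)\zeta\} = \mathrm{tr}\,Q(u)$), the finite grid standing in for $[0, R]$, and the function name `sup_deviation`.

```python
import numpy as np

def sup_deviation(V, a, u_grid, zeta):
    """Max over u in u_grid of |zeta' Q(u) zeta - tr Q(u)|, with Q(u) = V diag(t(u)) V'.

    Assumes unit-variance zeta, so that E{zeta' Q(u) zeta} = tr Q(u), and the
    illustrative (hypothetical) choice t_i(u) = 1 / (1 + u * a_i).
    """
    x2 = (V.T @ zeta) ** 2                  # squared coordinates of V' zeta, shape (m,)
    col_norms2 = np.sum(V ** 2, axis=0)     # diagonal of V'V, shape (m,)
    T = 1.0 / (1.0 + np.outer(u_grid, a))   # T[g, i] = t_i(u_g)
    return np.max(np.abs(T @ (x2 - col_norms2)))

rng = np.random.default_rng(0)
d, m = 200, 50
V = rng.standard_normal((d, m)) / np.sqrt(d)
a = rng.uniform(0.0, 1.0, size=m)
u_grid = np.linspace(0.0, 5.0, 101)         # finite grid standing in for [0, R]
sups = np.array([sup_deviation(V, a, u_grid, rng.choice([-1.0, 1.0], size=d))
                 for _ in range(500)])
for r in (1.0, 2.0, 4.0):
    print(f"r = {r}: empirical P(sup > r) = {np.mean(sups > r):.3f}")
```

The empirical tail frequencies can be compared informally with the sub-exponential decay in $r$ that the theorem predicts; the constants in the theorem are not computed here.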
3.2. Normal approximation

The main result of this section is Theorem 2, a multivariate normal approximation result for vectors of quadratic forms $(\zeta^\top Q_1 \zeta, \ldots, \zeta^\top Q_K \zeta)^\top$. Theorem 2 may be viewed as a generalization of Proposition 3.1 in (Chatterjee, 2008), which applies to a single quadratic form $\zeta^\top Q \zeta$ in Rademacher random variables $\zeta_j$ satisfying $P(\zeta_j = \pm 1) = 1/2$ (though our bound in Theorem 2 is not as tight as Chatterjee’s; see the discussion after the statement of the theorem). A proof of Theorem 2 may be found in Appendix B. The proof is based on Stein’s method of exchangeable pairs and the embedding technique from (Reinert and Röllin, 2009).

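Theorem 2. Let $\zeta_1, \ldots, \zeta_d$ be independent sub-Gaussian random variables with mean 0 and variance 1, and assume that they satisfy (2). Let $\zeta = (\zeta_1, \ldots, \zeta_d)^\top \in \mathbb{R}^d$. Additionally, for $k = 1, \ldots, K$, let $Q_k = (q_{ij}^{(k)})$ be a $d \times d$ positive semidefinite matrix and let $\check{Q}_k = \mathrm{diag}(q_{11}^{(k)}, \ldots, q_{dd}^{(k)})$. Define $w_k = \zeta^\top Q_k \zeta - \mathrm{tr}(Q_k)$, $\check{w}_k = \zeta^\top \check{Q}_k \zeta - \mathrm{tr}(Q_k)$, and

$$w = (w_1, \check{w}_1, \ldots, w_K, \check{w}_K)^\top \in \mathbb{R}^{2K}.$$

Finally, let $z \sim N(0, I_{2K})$ and $V = \mathrm{Cov}(w)$. There is an absolute constant $0 < C < \infty$ such that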

|E{/(w))-E{/(V'"z))| (3) 

< c(7 + 1 )® ( max lOtl) + Wd\f\,( max llftl) I, 

for all three-times differentiable functions $f : \mathbb{R}^{2K} \to \mathbb{R}$.

The upper bound (3) does not appear to be optimal; cf. Section 5 of (Jiang, 1996) and Section 3 of (Chatterjee, 2008), where conditions for convergence depend on the ratios $\|Q_k - \check{Q}_k\|^2 / \mathrm{Var}(\zeta^\top Q_k \zeta)$ and $\mathrm{tr}(Q_k^4) / \mathrm{Var}(\zeta^\top Q_k \zeta)^2$, respectively. However, it is likely that (3) can be improved by carefully examining the proof in Appendix B, if one is willing to accept a more complex (and potentially less user-friendly) bound. Moreover, we argue presently that the bound (3) is already effective in a range of practical settings. Assume that in addition to the conditions of Theorem 2, the $\zeta_j$ are iid with excess kurtosis $\gamma_2 = E(\zeta_1^4) - 3 \ge -2$. Also, let $\sigma_k^2 = \mathrm{Var}(\zeta^\top Q_k \zeta)$. By Lemma S8 from the Supplementary Material,

$$\sigma_k^2 = 2\,\mathrm{tr}(Q_k^2) + \gamma_2\,\mathrm{tr}(\check{Q}_k^2) \ge (2 + \gamma_2)\,\mathrm{tr}(\check{Q}_k^2).$$

Hence, the upper bound in Theorem 2 implies that $\{\zeta^\top Q_k \zeta - \mathrm{tr}(Q_k)\}/\sigma_k$ is asymptotically $N(0, 1)$, if

$$\frac{d^{1/2} \|Q_k\|^2}{\sigma_k^2} + \frac{d \|Q_k\|^3}{\sigma_k^3} \le \frac{d^{1/2} \|Q_k\|^2}{(2 + \gamma_2)\,\mathrm{tr}(\check{Q}_k^2)} + \left\{ \frac{d^{2/3} \|Q_k\|^2}{(2 + \gamma_2)\,\mathrm{tr}(\check{Q}_k^2)} \right\}^{3/2} \to 0.$$

We conclude that if (i) $\liminf \gamma_2 > -2$ and (ii) $d^{2/3} \|Q_k\|^2 / \mathrm{tr}(\check{Q}_k^2) \to 0$, then $\{\zeta^\top Q_k \zeta - \mathrm{tr}(Q_k)\}/\sigma_k$ is asymptotically $N(0, 1)$. Regarding (i), note that $\gamma_2 > -2$ for all distributions except the Rademacher distribution; furthermore, (ii) holds if, for instance, all of the eigenvalues of $Q_k$ are contained in a compact subset of $(0, \infty)$.
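
The variance identity used above, $\mathrm{Var}(\zeta^\top Q \zeta) = 2\,\mathrm{tr}(Q^2) + \gamma_2 \sum_i q_{ii}^2$ for iid mean-0, variance-1 entries with excess kurtosis $\gamma_2$, can be checked exactly in the Rademacher case ($\gamma_2 = -2$) by enumerating all sign vectors. The sketch below does this for a small matrix; the function names are hypothetical and the enumeration is only feasible for small $d$.

```python
import itertools
import numpy as np

def quad_form_variance(Q, excess_kurtosis):
    """Var(zeta' Q zeta) = 2 tr(Q^2) + gamma_2 * sum_i q_ii^2 (iid, mean 0, variance 1)."""
    Q = np.asarray(Q, dtype=float)
    return 2.0 * np.trace(Q @ Q) + excess_kurtosis * np.sum(np.diag(Q) ** 2)

def rademacher_variance_exact(Q):
    """Exact variance of zeta' Q zeta over all 2^d equally likely sign vectors."""
    d = Q.shape[0]
    vals = np.array([np.array(s) @ Q @ np.array(s)
                     for s in itertools.product([-1.0, 1.0], repeat=d)])
    return vals.var()   # population variance over the uniform enumeration

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Q = A @ A.T                                   # a positive semidefinite test matrix
print(quad_form_variance(Q, -2.0), rademacher_variance_exact(Q))
```

In the Rademacher case the formula reduces to $2 \sum_{i \ne j} q_{ij}^2$, which vanishes for diagonal $Q$; this is the degenerate situation excluded by condition (i).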

4. Linear random-effects models 

In this section, we apply the results from Section 3 to the variance components estimation problem in a linear random-effects model. We assume that

$$y = X\beta + \epsilon, \qquad (4)$$

where $y = (y_1, \ldots, y_n)^\top \in \mathbb{R}^n$ is an observed $n$-dimensional outcome vector, $X = (x_{ij})$ is an observed $n \times p$ predictor matrix with $x_{ij} \in \mathbb{R}$, $\beta = (\beta_1, \ldots, \beta_p)^\top \in \mathbb{R}^p$ is an unknown $p$-dimensional vector, and $\epsilon = (\epsilon_1, \ldots, \epsilon_n)^\top \in \mathbb{R}^n$ is an unobserved error vector. We further assume that $\beta_1, \ldots, \beta_p, \epsilon_1, \ldots, \epsilon_n$ are independent random variables with

$$E(\epsilon_i) = 0 \ \text{and} \ \mathrm{Var}(\epsilon_i) = \sigma_0^2, \quad i = 1, \ldots, n,$$
$$E(\beta_j) = 0 \ \text{and} \ \mathrm{Var}(\beta_j) = \frac{\sigma_0^2 \eta_0^2}{p}, \quad j = 1, \ldots, p. \qquad (5)$$

Here, we assume that the $\beta_j$ are all independent. In Section 4.3, we investigate a more general model with dependent random-effects and give a corresponding concentration bound. Throughout, we also assume that $X$ is independent of $\epsilon$ and $\beta$. Overall, (4)-(5) is a linear random-effects model with variance components parameters $\theta_0 = (\sigma_0^2, \eta_0^2)$. Observe that we have parametrized the model so that $\eta_0^2$ is a measure of the signal-to-noise ratio; this parametrization is standard (e.g. Hartley and Rao, 1967).

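Let $\theta = (\sigma^2, \eta^2)$ and define the Gaussian data log-likelihood,

$$\ell(\theta) = -\frac{1}{2} \log(\sigma^2) - \frac{1}{2n} \log\det\left( \frac{\eta^2}{p} XX^\top + I \right) - \frac{1}{2\sigma^2 n}\, y^\top \left( \frac{\eta^2}{p} XX^\top + I \right)^{-1} y.$$

Note that $\ell(\theta)$ is the log-likelihood for $\theta$, if $\beta \sim N\{0, (\eta_0^2 \sigma_0^2 / p) I\}$ and $\epsilon \sim N(0, \sigma_0^2 I)$ are Gaussian. In this section, we study properties of the maximum likelihood estimator (MLE),

$$\hat{\theta} = (\hat{\sigma}^2, \hat{\eta}^2) = \operatorname*{argmax}_{\sigma^2, \eta^2 \ge 0} \ell(\theta), \qquad (6)$$

in settings where $\epsilon$ and $\beta$ are not necessarily Gaussian. [N.B. if $\ell(\theta)$ in (6) has multiple maximizers, then use any pre-determined rule to select $\hat{\theta}$.]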
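
For concreteness, here is a minimal sketch of how the Gaussian log-likelihood can be evaluated through the eigen-decomposition of $p^{-1}XX^\top$. It assumes the log-likelihood is normalized by $n$ and stripped of additive constants, i.e. $\ell(\theta) = -\tfrac12\log\sigma^2 - \tfrac{1}{2n}\log\det(\tfrac{\eta^2}{p}XX^\top + I) - \tfrac{1}{2\sigma^2 n}\, y^\top(\tfrac{\eta^2}{p}XX^\top + I)^{-1}y$; the function name `gaussian_loglik` is hypothetical.

```python
import numpy as np

def gaussian_loglik(sigma2, eta2, y, X):
    """Normalized Gaussian log-likelihood ell(sigma2, eta2), up to an additive constant."""
    n, p = X.shape
    lam, U = np.linalg.eigh(X @ X.T / p)   # eigenvalues/eigenvectors of p^{-1} X X'
    lam = np.clip(lam, 0.0, None)          # remove tiny negative rounding errors
    z2 = (U.T @ y) ** 2                    # squared coordinates of y in the eigenbasis
    w = eta2 * lam + 1.0                   # eigenvalues of (eta2/p) X X' + I
    return (-0.5 * np.log(sigma2)
            - 0.5 * np.mean(np.log(w))
            - 0.5 * np.mean(z2 / w) / sigma2)

# Simulated data from the model with (hypothetical) true values sigma0^2 = 1, eta0^2 = 2.
rng = np.random.default_rng(0)
n, p = 300, 600
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) * np.sqrt(1.0 * 2.0 / p)
y = X @ beta + rng.standard_normal(n)
print(gaussian_loglik(1.0, 2.0, y, X), gaussian_loglik(3.0, 0.2, y, X))
```

Working in the eigenbasis avoids refactorizing an $n \times n$ matrix for every candidate $\theta$, which is convenient when the likelihood is evaluated on a grid or inside an optimizer.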

The estimator $\hat{\theta}$ has already been widely studied in the literature, even in settings where $\epsilon$ and $\beta$ are not Gaussian (e.g. Jiang, 1996; Richardson and Welsh, 1994). In practice, $\hat{\theta}$ and other closely related estimators, such as REML estimators, are probably the most commonly used variance components estimators for linear random-effects models (Demidenko, 2013; Harville, 1977; Searle et al., 1992). Jiang’s (1996) work is especially relevant for the results in this section. Jiang studied models with independent random-effects and derived general consistency and asymptotic normality results for $\hat{\theta}$ that are valid in some of the settings considered here. However, asymptotic results tend to have more limited flexibility for use in certain applications. This has become more notable recently, with the widespread use of random-effects models in genomic and other applications, as discussed in Section 1.

In Sections 4.2-4.4 below, we present finite sample concentration and normal approximation bounds for $\hat{\theta}$, which follow from Theorems 1 and 2, respectively. These bounds have not been optimized and some of the quantities in the bounds can be extremely large for given values of $\theta_0$ and $p^{-1}XX^\top$ [e.g. $\kappa(\sigma_0^2, \eta_0^2, \Lambda)^{-1}$ and $\nu(\sigma_0^2, \eta_0^2, \Lambda)$, defined in (10) and (19) below]. However, as described in the text below, Propositions 1 and 3 still yield the “correct” asymptotic conclusions, similar to (Jiang, 1996), which ensure consistency and asymptotic normality of $\hat{\theta}$, if $p/n \to \rho \in (0, \infty)$ and the model parameters are bounded. Though it may be of interest to further optimize Propositions 1-3 (and it is almost certainly possible), our main emphasis is that the non-asymptotic approach taken here provides additional flexibility for deriving and understanding results in less standard settings. For instance, while Propositions 1 and 3 parallel existing results in (Jiang, 1996), Proposition 2 is a concentration bound for linear models with correlated random-effects and appears to be more novel [an application of Proposition 2 may be found in (Dicker and Erdogdu, 2015)].

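4.1. Additional notation

It is convenient to introduce some notation relating to the spectrum of $X$. Let $\lambda_1 \ge \cdots \ge \lambda_n \ge 0$ be the eigenvalues of $p^{-1}XX^\top$ and suppose that $p^{-1}XX^\top = U \Lambda U^\top$ is the eigen-decomposition of $p^{-1}XX^\top$, where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and $U$ is an $n \times n$ orthogonal matrix. Let $n_0 = \max\{i;\ \lambda_i > 0\}$ and define the empirical variance of the eigenvalues of $p^{-1}XX^\top$,

$$\mathfrak{v}(\Lambda) = \frac{1}{n} \sum_{i=1}^{n} \lambda_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} \lambda_i \right)^2.$$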
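
The spectral quantities just introduced are straightforward to compute. The sketch below returns the ordered eigenvalues, the number $n_0$ of positive eigenvalues, and the empirical eigenvalue variance, taking the latter to mean $n^{-1}\sum_i \lambda_i^2 - (n^{-1}\sum_i \lambda_i)^2$ (an assumption about the exact normalization). The function name and the numerical tolerance used to decide which eigenvalues count as positive are hypothetical choices.

```python
import numpy as np

def spectrum_summary(X, tol=1e-10):
    """Eigenvalues of p^{-1} X X' (descending), n0 = #{positive eigenvalues}, and their variance."""
    n, p = X.shape
    lam = np.linalg.eigvalsh(X @ X.T / p)[::-1]     # eigvalsh returns ascending order
    lam = np.clip(lam, 0.0, None)                   # remove tiny negative rounding errors
    n0 = int(np.sum(lam > tol * max(lam[0], 1.0)))  # numerical rank of X X'
    v = np.mean(lam ** 2) - np.mean(lam) ** 2       # empirical variance of the eigenvalues
    return lam, n0, v

rng = np.random.default_rng(0)
lam, n0, v = spectrum_summary(rng.standard_normal((5, 3)))   # n = 5 > p = 3, so n0 < n
print(n0, v)
```

When all eigenvalues coincide the variance is zero, which is the non-identifiable case discussed after Proposition 1.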
4.2. Concentration bound

To derive a concentration bound for $\hat{\theta}$ (Proposition 1 below), we follow standard steps in the analysis of variance components estimators (Hartley and Rao, 1967). In particular, we introduce the profile likelihood and other related objects, which essentially reduce the bivariate optimization problem (6) to a univariate problem. Basic calculus implies that if $\eta^2 \ge 0$, then

$$\max_{\sigma^2 \ge 0} \ell(\sigma^2, \eta^2) = \ell_*(\eta^2),$$

where $\ell_*(\eta^2) = \ell\{\sigma_*^2(\eta^2), \eta^2\}$ is called the profile likelihood and

$$\sigma_*^2(\eta^2) = \frac{1}{n}\, y^\top \left( \frac{\eta^2}{p} XX^\top + I \right)^{-1} y. \qquad (7)$$

It follows that $\hat{\theta} = (\sigma_*^2(\hat{\eta}^2), \hat{\eta}^2)$, where

$$\hat{\eta}^2 = \operatorname*{argmax}_{\eta^2 \ge 0} \ell_*(\eta^2). \qquad (8)$$


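The proof of Proposition 1 hinges on comparing the profile likelihood $\ell_*(\eta^2)$ to its population version,

$$\ell_0(\eta^2) = -\frac{1}{2} \log\{\sigma_0^2(\eta^2)\} - \frac{1}{2n} \log\det\left( \frac{\eta^2}{p} XX^\top + I \right) - \frac{1}{2},$$

where we have replaced $\sigma_*^2(\eta^2)$ in $\ell_*(\eta^2)$ with its expectation,

$$\sigma_0^2(\eta^2) = E\{\sigma_*^2(\eta^2) \mid X\} = \frac{\sigma_0^2}{n}\, \mathrm{tr}\left\{ \left( \frac{\eta_0^2}{p} XX^\top + I \right) \left( \frac{\eta^2}{p} XX^\top + I \right)^{-1} \right\} = \frac{\sigma_0^2}{n} \sum_{i=1}^{n} \frac{\eta_0^2 \lambda_i + 1}{\eta^2 \lambda_i + 1}.$$

Observe that $\sigma_0^2(\eta_0^2) = \sigma_0^2$.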
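
The reduction to a univariate problem can be sketched numerically as follows. The code works in the eigenbasis of $p^{-1}XX^\top$ (eigenvalues `lam`, squared rotated responses `z2` $= (U^\top y)^2$), assumes the log-likelihood is normalized by $n$, and maximizes the profile likelihood over a user-supplied grid; the grid search and all function names are illustrative assumptions rather than the procedure used in the proofs.

```python
import numpy as np

def sigma2_star(eta2, lam, z2):
    """sigma_*^2(eta^2) = n^{-1} y' ((eta^2/p) X X' + I)^{-1} y, computed in the eigenbasis."""
    return np.mean(z2 / (eta2 * lam + 1.0))

def sigma2_pop(eta2, lam, sigma2_0, eta2_0):
    """Population version: (sigma_0^2 / n) * sum_i (eta_0^2 lam_i + 1) / (eta^2 lam_i + 1)."""
    return sigma2_0 * np.mean((eta2_0 * lam + 1.0) / (eta2 * lam + 1.0))

def profile_loglik(eta2, lam, z2):
    """Profile likelihood: the log-likelihood with sigma^2 replaced by sigma_*^2(eta^2)."""
    return (-0.5 * np.log(sigma2_star(eta2, lam, z2))
            - 0.5 * np.mean(np.log(eta2 * lam + 1.0)) - 0.5)

def profile_mle(lam, z2, eta2_grid):
    """Grid maximizer of the profile likelihood; returns (sigma2_hat, eta2_hat)."""
    vals = [profile_loglik(e, lam, z2) for e in eta2_grid]
    eta2_hat = eta2_grid[int(np.argmax(vals))]
    return sigma2_star(eta2_hat, lam, z2), eta2_hat

lam = np.array([0.5, 1.0, 2.0, 3.0])
print(sigma2_pop(1.0, lam, sigma2_0=2.0, eta2_0=1.0))   # evaluates sigma_0^2(eta_0^2)
```

Feeding `profile_mle` the expected values $E(z_i^2 \mid X) = \sigma_0^2(\eta_0^2\lambda_i + 1)$ in place of `z2` turns the profile likelihood into its population version, whose maximizer is $\eta_0^2$ whenever the eigenvalues are not all equal.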

Overall, our strategy for proving Proposition 1 mirrors the classical parametric theory for consistency of maximum likelihood and M-estimators (e.g. Chapter 5 of Van der Vaart, 2000), except that we employ Theorem 1 at several key steps. As in the standard analysis, two important facts underlying Proposition 1 are (i) $\eta_0^2$ is the unique maximizer of $\ell_0(\eta^2)$ and (ii) $\ell_*(\eta^2) \approx \ell_0(\eta^2)$, when $n, p$ are large. Theorem 1 is used to make the approximation $\ell_*(\eta^2) \approx \ell_0(\eta^2)$ more precise. It should not be surprising that quadratic forms play an important role in the analysis, given the dependence of $\ell_*(\eta^2)$ on the quadratic form $\sigma_*^2(\eta^2)$.

We emphasize that to prove Proposition 1, we use Theorem 1 with $K = 1$ and $u = \eta^2$; the general version of Theorem 1 with matrix functions defined on $[0, R]^K$ may be useful for studying random-effects models with $K$ groups of random-effects, e.g. the general linear random-effects model considered in (Jiang, 1996). Proposition 1 is proved in Appendix C.
Proposition 1. Assume that the linear random-effects model (4)-(5) holds and that $\beta_1, \ldots, \beta_p, \epsilon_1, \ldots, \epsilon_n$ are independent sub-Gaussian random variables satisfying


max 


for some $0 < \gamma < \infty$. Finally, define

a»{Ar 




K + mvi + 

(a) Suppose that $n_0 = n$. There is an absolute constant $0 < C < \infty$ such that


P < 110 — 0 nll > r 


X > < C exp 


(_n _ ^ r^ 

[ C 7^(7 + ly (r + 1)2 


for every $r \ge 0$.

(b) Suppose that $n_0 < n$. There is an absolute constant $0 < C < \infty$ such that


P S 110 — 0nll > r 


X 1 < C exp 


■(P ■ 72(7 + 1)2 


1 _ (yA\ 

n J \ n J 


(9) 


( 10 ) 


(r + 1 )^ 


for every $r \ge 0$.

For given values of $\sigma_0^2$, $\eta_0^2$ and $\Lambda$, the quantity $\kappa(\sigma_0^2, \eta_0^2, \Lambda)$ in Proposition 1 may be extremely small. We have not attempted to optimize $\kappa(\sigma_0^2, \eta_0^2, \Lambda)$, and the bounds in the proposition can almost certainly be improved at the expense of some additional calculations and a more complex bound. However, despite the magnitude of $\kappa(\sigma_0^2, \eta_0^2, \Lambda)$, the proposition yields very sensible asymptotic conclusions. Indeed, the key property of $\kappa(\sigma_0^2, \eta_0^2, \Lambda)$ is that if $U \subseteq (0, \infty)$ is compact, then

$$0 < \inf\left\{ \kappa(\sigma_0^2, \eta_0^2, \Lambda);\ \sigma_0^2, \eta_0^2, \lambda_1, \ldots, \lambda_n, \mathfrak{v}(\Lambda) \in U \right\}. \qquad (11)$$

An immediate consequence is that if $\sigma_0^2, \eta_0^2, \lambda_1, \ldots, \lambda_n, \mathfrak{v}(\Lambda)$ are contained in a compact subset of $(0, \infty)$, then Proposition 1 implies that $\hat{\theta}$ converges to $\theta_0$ at rate $n^{-1/2}$ [at least when $n = n_0$; if $n_0 < n$, then part (b) of the proposition requires the additional condition that $n_0/n$ stays away from 1; this is discussed further below].

The bounds in Proposition 1 are tighter [i.e. $\kappa(\sigma_0^2, \eta_0^2, \Lambda)$ is larger] when the eigenvalue variance $\mathfrak{v}(\Lambda)$ is large. This is related to identifiability: $\sigma_0^2$ and $\eta_0^2$ are not identifiable when $\mathfrak{v}(\Lambda) = 0$, and it is easier to distinguish between them when $\mathfrak{v}(\Lambda)$ is large.

The cases where $n_0 = n$ and $n_0 < n$ are considered separately in Proposition 1 because the large-$\eta^2$ asymptotic behavior of $\sigma_0^2(\eta^2) = E\{\sigma_*^2(\eta^2) \mid X\}$ differs in these two settings. In particular, if $n_0 = n$, then $\sigma_0^2(\eta^2) \asymp \eta^{-2}$ as $\eta^2 \to \infty$; on the other hand, if $n_0 < n$, then $\sigma_0^2(\eta^2) \to \sigma_0^2 (1 - n_0/n)$ as $\eta^2 \to \infty$.
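
The two regimes can be seen directly from the closed form $\sigma_0^2(\eta^2) = (\sigma_0^2/n) \sum_i (\eta_0^2\lambda_i + 1)/(\eta^2\lambda_i + 1)$: each zero eigenvalue contributes a constant term $\sigma_0^2/n$, while each positive eigenvalue contributes a term of order $\eta^{-2}$. A small numerical sketch, with hypothetical eigenvalues and parameter values:

```python
import numpy as np

def sigma2_pop(eta2, lam, sigma2_0, eta2_0):
    """sigma_0^2(eta^2) = (sigma_0^2 / n) * sum_i (eta_0^2 lam_i + 1) / (eta^2 lam_i + 1)."""
    return sigma2_0 * np.mean((eta2_0 * lam + 1.0) / (eta2 * lam + 1.0))

lam_full = np.array([0.5, 1.0, 2.0, 4.0])   # n0 = n: all eigenvalues positive
lam_def = np.array([2.0, 1.0, 0.0, 0.0])    # n0 = 2 < n = 4
for eta2 in (1e2, 1e4, 1e6):
    print(eta2,
          eta2 * sigma2_pop(eta2, lam_full, 2.0, 1.5),   # stabilizes: decay like 1/eta^2
          sigma2_pop(eta2, lam_def, 2.0, 1.5))           # tends to sigma_0^2 * (1 - n0/n)
```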

Note that Proposition 1 (a) actually makes no explicit reference to $p$, or to the relative convergence rates of $p$ and $n$. However, there are implicit conditions on $p$. For instance, since $n_0 = n$ in part (a), we must have $n \le p$. Additionally, in order to ensure that $\lambda_1, \ldots, \lambda_n$ are contained in a compact subset $U \subseteq (0, \infty)$, so that (11) holds, it may be natural to enforce other conditions on $p$, e.g. $p/n \to \rho \in (1, \infty)$.

Part (b) of Proposition 1 applies to settings where $p < n$. Note that the upper bound in part (b) contains an additional term depending on $1 - n_0/n$ and $n_0/n$, as compared to Proposition 1 (a). Thus, assuming that $\sigma_0^2, \eta_0^2, \lambda_1, \ldots, \lambda_n, \mathfrak{v}(\Lambda)$ are contained in a compact subset of $(0, \infty)$, we conclude that $\hat{\theta}$ converges to $\theta_0$ at rate $n^{-1/2}$ if

$$\liminf \left( 1 - \frac{n_0}{n} \right) \frac{n_0}{n} > 0. \qquad (12)$$

Observe that (12) implies $p \to \infty$. Hence, we need $p \to \infty$ in order to ensure that $\hat{\theta}$ is consistent. This is reasonable because information about $\eta_0^2 = p E(\beta_j^2)/\sigma_0^2$ is accumulated through $\beta_1, \ldots, \beta_p$. The condition (12) also implies that if $X$ is full rank, then we must have $p/n \to \rho < 1$ in order to ensure consistency. This condition seems less natural and can likely be relaxed with a more careful analysis; similar challenges arise frequently in random matrix theory when $p/n \to 1$ (e.g. Bai et al., 2003).

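4.3. A more general concentration bound

In this section, we investigate the performance of $\hat{\theta}$ in models where the random-effects might be dependent. Suppose that $\tilde{\beta} = (\tilde{\beta}_1, \ldots, \tilde{\beta}_p)^\top \in \mathbb{R}^p$ is a random vector that is independent of $\epsilon$, $X$ and let

$$\tilde{y} = X\tilde{\beta} + \epsilon. \qquad (13)$$

We do not assume that $\tilde{\beta}$ has independent components or that each of the components has the same variance. We define the variance components estimator based on the data $(\tilde{y}, X)$,

$$\tilde{\theta} = (\tilde{\sigma}^2, \tilde{\eta}^2) = \operatorname*{argmax}_{\sigma^2, \eta^2 \ge 0} \tilde{\ell}(\theta), \qquad (14)$$

where

$$\tilde{\ell}(\theta) = -\frac{1}{2} \log(\sigma^2) - \frac{1}{2n} \log\det\left( \frac{\eta^2}{p} XX^\top + I \right) - \frac{1}{2\sigma^2 n}\, \tilde{y}^\top \left( \frac{\eta^2}{p} XX^\top + I \right)^{-1} \tilde{y}.$$

The next proposition is a concentration bound for $\tilde{\theta}$, which implies that the estimator may still perform reliably, if there is a good independent coupling for $\tilde{\beta}$.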
Proposition 2. Suppose that $\tilde{y}$, $\tilde{\theta}$ satisfy (13)-(14). Suppose further that $\beta = (\beta_1, \ldots, \beta_p)^\top \in \mathbb{R}^p$ is a random vector with independent components, which is independent of $\epsilon$, $X$ (but may be correlated with $\tilde{\beta}$), such that the independent random-effects model (4)-(5) and (9) hold. Let $\kappa(\sigma_0^2, \eta_0^2, \Lambda)$ be as in (10).

(a) Suppose that $n_0 = n$. There is an absolute constant $0 < C < \infty$ such that


P < 110 — 0nll > r 




+ 4P^ II/3-/3II > 


C 7^(7 + 1)^ [r + 1)2 

1 K{al,ril,K) n r 


C (7 + 1 )^ p + n r + 1 

for every $r \ge 0$.

(b) Suppose that $n_0 < n$. There is an absolute constant $0 < C < \infty$ such that


X 


P < 110 — 0nll > r 


X 1 ^ (7 exp 


_n _ At(ag,r/g,A) ^ 2 ^ 

C 72(7 + 1)2 V n ) V n / 


+ 4P 11/3 -/3|| > 


7^(7 + 1 )^ 

1 K{al,rjl,X) n 


C (7 + 1 )“^ p + n r + 1 


(r + 1)2 


X 


(15) 


(16) 


for every $r \ge 0$.


The proof of Proposition 2 is similar to that of Proposition 1 and may be found in Section S1 of the Supplementary Material. Observe that the first term in each upper bound (15)-(16) is exactly the same as in Proposition 1. The second term in each of the bounds is new; this term is small, if $\|\tilde{\beta} - \beta\|$ is typically small. In other words, in a random-effects model where (5) does not hold, the Gaussian maximum likelihood estimator $\tilde{\theta}$ may be a reliable estimator for the variance components parameter $\theta_0 = (\sigma_0^2, \eta_0^2)$ from a corresponding random-effects model (4)-(5), if $\tilde{\beta} \approx \beta$. Proposition 2 is useful for applications involving misspecified random-effects models. For example, it can be used to recover some of Jiang et al.’s (2014) results for sparse random-effects models in genome-wide association studies (though Jiang et al. take a very different approach), and for variance estimation problems in high-dimensional linear models with fixed (non-random) $\beta$ (Dicker and Erdogdu, 2015). In both of these applications, the predictors $x_{ij}$ are assumed to be random; the strategy is to leverage symmetry in the predictor distribution to reduce the problem to one where $\tilde{\beta}$ is exchangeable and has a tight independent coupling, so that Proposition 2 can be applied.


4.4. Normal approximation

In this section, we shift our attention back to the independent random-effects model (4)-(5) and give a normal approximation result for $\hat{\theta}$ (Proposition 3 below). One consequence of Proposition 3 is that under conditions similar to those described after Proposition 1, $\sqrt{n}(\hat{\theta} - \theta_0)$ is asymptotically normal, when $p/n \to \rho \in [0, \infty)$. As with consistency (discussed in Section 4.2), asymptotic normality of $\hat{\theta}$ has been studied previously in similar settings (Jiang, 1996). However, the main significance of Proposition 3 is its flexible finite-sample nature, which makes it an easy-to-use tool for applications.

To derive Proposition 3, we again follow the standard strategy for parametric M-estimators. First, we introduce the score function


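$$S(\theta) = \frac{\partial \ell}{\partial \theta}(\theta) = \begin{pmatrix} S_1(\theta) \\ S_2(\theta) \end{pmatrix} = \begin{pmatrix} \dfrac{1}{2\sigma^4 n}\, y^\top \left( \dfrac{\eta^2}{p} XX^\top + I \right)^{-1} y - \dfrac{1}{2\sigma^2} \\[2ex] \dfrac{1}{2\sigma^2 n}\, y^\top \left( \dfrac{\eta^2}{p} XX^\top + I \right)^{-1} \dfrac{1}{p} XX^\top \left( \dfrac{\eta^2}{p} XX^\top + I \right)^{-1} y - \dfrac{1}{2n}\, \mathrm{tr}\left\{ \dfrac{1}{p} XX^\top \left( \dfrac{\eta^2}{p} XX^\top + I \right)^{-1} \right\} \end{pmatrix}.$$
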

Then $S(\hat{\theta}) = 0$, provided $\hat{\eta}^2 > 0$. The main idea of the proof is to Taylor expand the score function about $\theta_0$ so that

$$0 = S(\hat{\theta}) = S(\theta_0) + J(\theta_0)(\hat{\theta} - \theta_0) + r, \qquad (17)$$

where $J(\theta) = \frac{\partial}{\partial \theta} S(\theta)$ and $r$ is a remainder term. Proposition 3 follows by solving for $\hat{\theta} - \theta_0$ above, then using three key intermediate results: (i) $S(\theta_0)$ is approximately normal, (ii) $J(\theta_0) \approx J_0(\theta_0)$, where

$$J_0(\theta) = E\{J(\theta)\} = E\left\{ \frac{\partial}{\partial \theta} S(\theta) \right\}, \qquad (18)$$

and (iii) the remainder term $r$ is small. Approximate normality of $S(\theta_0)$ follows from Theorem 2 in this paper. The approximation $J(\theta_0) \approx J_0(\theta_0)$ and the fact that $r$ is small follow from concentration properties of quadratic forms.
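
As a sanity check on the score-based argument, the sketch below codes the log-likelihood and its gradient in the eigenbasis of $p^{-1}XX^\top$ and compares the analytic score with a central finite difference. It assumes the $n$-normalized log-likelihood $\ell(\theta) = -\tfrac12\log\sigma^2 - \tfrac{1}{2n}\sum_i \log(\eta^2\lambda_i + 1) - \tfrac{1}{2\sigma^2 n}\sum_i z_i^2/(\eta^2\lambda_i + 1)$ with $z = U^\top y$; the function names and the test values are hypothetical.

```python
import numpy as np

def loglik(theta, lam, z2):
    """Normalized Gaussian log-likelihood in the eigenbasis; theta = (sigma2, eta2)."""
    sigma2, eta2 = theta
    w = eta2 * lam + 1.0
    return -0.5 * np.log(sigma2) - 0.5 * np.mean(np.log(w)) - 0.5 * np.mean(z2 / w) / sigma2

def score(theta, lam, z2):
    """Gradient of loglik with respect to (sigma2, eta2)."""
    sigma2, eta2 = theta
    w = eta2 * lam + 1.0
    s1 = -0.5 / sigma2 + 0.5 * np.mean(z2 / w) / sigma2 ** 2
    s2 = -0.5 * np.mean(lam / w) + 0.5 * np.mean(z2 * lam / w ** 2) / sigma2
    return np.array([s1, s2])

def numerical_score(theta, lam, z2, h=1e-6):
    """Central finite-difference approximation to the gradient of loglik."""
    grad = np.empty(2)
    for j in range(2):
        step = np.zeros(2)
        step[j] = h
        grad[j] = (loglik(theta + step, lam, z2) - loglik(theta - step, lam, z2)) / (2 * h)
    return grad

rng = np.random.default_rng(0)
lam = rng.uniform(0.0, 3.0, size=20)
z2 = rng.standard_normal(20) ** 2
theta = np.array([1.3, 0.7])
print(score(theta, lam, z2), numerical_score(theta, lam, z2))
```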

Proposition 3. Assume that the linear random-effects model (4)-(5) holds and that $\beta_1, \ldots, \beta_p, \epsilon_1, \ldots, \epsilon_n$ are independent random variables satisfying (9). Define




K + + 1)^^ {P(A) + Ip 

aWo 


(19) 


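let $f \in C_b^3(\mathbb{R}^2)$, and let $z_2 \sim N(0, I)$ be a two-dimensional standard normal random vector. Finally, let $J_0(\theta_0)$ be as in (18), define $\mathcal{I}(\theta_0) = \mathrm{Var}\{\sqrt{n}\, S(\theta_0) \mid X\}$, and define

$$T = J_0(\theta_0)^{-1}\, \mathcal{I}(\theta_0)\, J_0(\theta_0)^{-1}. \qquad (20)$$
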
There is an absolute constant $0 < C < \infty$ such that


E 


f{V^ie-9o)} xl -E{/(v1>V2z2)|X} 




C(7 + rtlK) i P](l + l/lfc) 


p + n log(n)" 


fc=i 


n 


ni/2 


+ 21/I0P \\ 0 -eo\\ > 


log(n) 


2'y/n 


X 


( 21 ) 


A detailed proof of Proposition 3 may be found in Appendix D. The quantity $\nu(\sigma_0^2, \eta_0^2, \Lambda)$ in (21) is potentially extremely large, and plays a role similar to $\kappa(\sigma_0^2, \eta_0^2, \Lambda)$ in Propositions 1-2. As with the previous propositions, despite the potential magnitude of $\nu(\sigma_0^2, \eta_0^2, \Lambda)$, the asymptotic implications of Proposition 3 are very reasonable. Indeed, assume that the conditions of the proposition hold. If, additionally, $\sigma_0^2, \eta_0^2, \lambda_1, \ldots, \lambda_n, \mathfrak{v}(\Lambda)$ are contained in a compact subset of $(0, \infty)$ and $p/n \to \rho \in [0, \infty)$, then it is clear that the first term on the right-hand side of (21) converges to 0. Moreover, Proposition 1 implies that the second term on the right-hand side of (21) converges to 0, as long as we have the additional condition (12) when $n_0 < n$. Thus, under the specified conditions,

$$\left| E\left[ f\{\sqrt{n}(\hat{\theta} - \theta_0)\} \mid X \right] - E\left\{ f(T^{1/2} z_2) \mid X \right\} \right| \to 0 \qquad (22)$$

for all $f \in C_b^3(\mathbb{R}^2)$. This is an asymptotic normality result for $\sqrt{n}(\hat{\theta} - \theta_0)$. One apparent limitation of (22) is that it only applies for $f \in C_b^3(\mathbb{R}^2)$. However, standard arguments (e.g. Section 3 of Reinert and Röllin, 2009) imply that (22) is valid for broader classes of non-smooth functions $f$, including indicator functions for measurable convex subsets of $\mathbb{R}^2$; thus, we may conclude that $T^{-1/2} \sqrt{n}(\hat{\theta} - \theta_0) \to N(0, I)$ in distribution, where $T$ is defined in (20).

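We note additionally that if $\beta$ and $\epsilon$ are Gaussian, then $T = \mathcal{I}(\theta_0)^{-1} = \mathcal{I}_N(\theta_0)^{-1}$, where $\mathcal{I}_N(\theta_0) = (\iota_{kl}(\theta_0))_{k,l=1,2}$ is the Gaussian Fisher information matrix for $\theta_0$ and

$$\iota_{kl}(\theta_0) = \frac{1}{2 n \sigma_0^{2(4-k-l)}}\, \mathrm{tr}\left\{ \left( \frac{1}{p} XX^\top \right)^{k+l-2} \left( \frac{\eta_0^2}{p} XX^\top + I \right)^{2-k-l} \right\}, \quad k, l = 1, 2.$$
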


Moreover, standard likelihood theory (e.g. Chapter 6 of Lehmann and Casella, 1998) implies that $\hat{\theta}$ is asymptotically efficient in the Gaussian random-effects model.
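
The Gaussian Fisher information can be cross-checked against the expected derivative of the score. In the eigenbasis of $p^{-1}XX^\top$ the entries are $\iota_{kl}(\theta_0) = \{2\sigma_0^{2(4-k-l)}\}^{-1}\, n^{-1}\sum_i \lambda_i^{k+l-2}(\eta_0^2\lambda_i + 1)^{2-k-l}$, and because the score is linear in the squared rotated responses $z_i^2$, its expected Jacobian is obtained by plugging in $E(z_i^2 \mid X) = \sigma_0^2(\eta_0^2\lambda_i + 1)$. The sketch below assumes the $n$-normalized log-likelihood and uses finite differences; all names and values are hypothetical.

```python
import numpy as np

def score(theta, lam, z2):
    """Gradient of the normalized Gaussian log-likelihood with respect to (sigma2, eta2)."""
    sigma2, eta2 = theta
    w = eta2 * lam + 1.0
    s1 = -0.5 / sigma2 + 0.5 * np.mean(z2 / w) / sigma2 ** 2
    s2 = -0.5 * np.mean(lam / w) + 0.5 * np.mean(z2 * lam / w ** 2) / sigma2
    return np.array([s1, s2])

def fisher_info(theta0, lam):
    """Per-observation Gaussian Fisher information matrix (iota_kl), k, l = 1, 2."""
    sigma2, eta2 = theta0
    w = eta2 * lam + 1.0
    info = np.empty((2, 2))
    for k in (1, 2):
        for l in (1, 2):
            info[k - 1, l - 1] = (np.mean(lam ** (k + l - 2) * w ** (2 - k - l))
                                  / (2.0 * sigma2 ** (4 - k - l)))
    return info

def expected_jacobian(theta0, lam, h=1e-6):
    """Finite-difference Jacobian of the score, evaluated at the expected z2 under theta0."""
    sigma2, eta2 = theta0
    z2_mean = sigma2 * (eta2 * lam + 1.0)    # E(z_i^2 | X) under the model at theta0
    J = np.empty((2, 2))
    for j in range(2):
        step = np.zeros(2)
        step[j] = h
        J[:, j] = (score(theta0 + step, lam, z2_mean)
                   - score(theta0 - step, lam, z2_mean)) / (2 * h)
    return J

lam = np.array([0.0, 0.5, 1.0, 2.0, 3.5])
theta0 = np.array([1.5, 0.8])
print(fisher_info(theta0, lam))
print(-expected_jacobian(theta0, lam))
```

The agreement between the two matrices uses only the first two moments of the data, which is why the expected Jacobian does not depend on Gaussianity while the variance of the score does.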


5. Discussion 

We have presented new uniform concentration and normal approximation bounds for quadratic forms, and described some applications to variance components estimation in linear random-effects models. We expect that the general results for quadratic forms, found in Section 3, will be useful in a range of other applications, such as variance components estimation in non-standard random- and fixed-effects linear models, which arise in genomics and other applications (Dicker and Erdogdu, 2015; Jiang et al., 2014); hypothesis testing for variance components parameters in high-dimensional models; and other hypothesis testing problems, where the test statistics involve quadratic forms in many random variables. As discussed in Sections 3.2 and 4, many of the bounds in the paper can be improved, at the expense of introducing some additional complexity into the results. Furthermore, all of our results require sub-Gaussian random variables. It may be of interest to sharpen the results in the paper and extend them to allow for heavier-tailed random variables with sufficiently many moments.

Appendix A Proof of Theorem 1 

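The proof begins with a chaining construction. Fix a positive integer $M$ and, for $j = 0, 1, \ldots, M$, let $U_j = \{k 2^{-j} R\}_{k=0}^{2^j} \subseteq [0, R] \subseteq \mathbb{R}$; the finest level defines a regular grid on $[0, R]^K$ with $(2^M + 1)^K$ points, $U_M^K = U_M \times \cdots \times U_M$. For each $u = (u_1, \ldots, u_K)^\top \in [0, R]^K$ and $j = 1, \ldots, M$ define $\bar{u}_j = (u_{1j}, \ldots, u_{Kj})^\top$, where $u_{ij}$ is the smallest point in $U_j$ that is at least as large as $u_i$; additionally, define $\bar{u}_0 = 0$.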
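
The chaining points can be made concrete with a few lines of code. The sketch below rounds each coordinate of $u$ up to the dyadic grid $\{k 2^{-j} R\}$ at levels $j = 1, \ldots, M$ and prepends the level-0 point 0, which is one reading of the construction above; the function name is hypothetical.

```python
import numpy as np

def chain_points(u, R, M):
    """Return [u_bar_0, ..., u_bar_M]: u rounded up, coordinatewise, to the grid {k 2^-j R}."""
    u = np.asarray(u, dtype=float)
    chain = [np.zeros_like(u)]                         # u_bar_0 = 0
    for j in range(1, M + 1):
        step = R / 2 ** j                              # mesh width at level j
        chain.append(np.minimum(np.ceil(u / step) * step, R))
    return chain

chain = chain_points([0.3, 0.71], R=1.0, M=4)
for j, point in enumerate(chain):
    print(j, point)
```

By construction each coordinate of $\bar{u}_M$ exceeds the corresponding coordinate of $u$ by less than $2^{-M}R$, so $\|u - \bar{u}_M\| \le K^{1/2} R\, 2^{-M}$, which is the bound used for the first term in the proof.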

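Next, consider the decomposition

$$\zeta^\top Q(u) \zeta - E\{\zeta^\top Q(u) \zeta\} = \Delta_1(u) + \Delta_2(u) + \Delta_3(u),$$

where

$$\Delta_1(u) = \zeta^\top \{Q(u) - Q(\bar{u}_M)\} \zeta - E[\zeta^\top \{Q(u) - Q(\bar{u}_M)\} \zeta],$$
$$\Delta_2(u) = \zeta^\top Q(\bar{u}_0) \zeta - E\{\zeta^\top Q(\bar{u}_0) \zeta\},$$
$$\Delta_3(u) = \sum_{j=1}^{M} \zeta^\top \{Q(\bar{u}_j) - Q(\bar{u}_{j-1})\} \zeta - E\left[ \sum_{j=1}^{M} \zeta^\top \{Q(\bar{u}_j) - Q(\bar{u}_{j-1})\} \zeta \right].$$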

i=i i=i 


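Let $r_1, r_2, r_3 > 0$ satisfy $r_1 + r_2 + r_3 = r$. Then

$$P\left[ \sup_{u \in [0, R]^K} \left| \zeta^\top Q(u) \zeta - E\{\zeta^\top Q(u) \zeta\} \right| > r \right] \le \sum_{i=1}^{3} P\left\{ \sup_{u \in [0, R]^K} |\Delta_i(u)| > r_i \right\}. \qquad (23)$$

To prove the theorem, we bound each term on the right-hand side of (23). To bound the term in (23) involving $\Delta_1(u)$, observe that

$$\begin{aligned} |\Delta_1(u)| &\le \|\zeta\|^2 \|Q(u) - Q(\bar{u}_M)\| + \left| \mathrm{tr}\left[ E(\zeta\zeta^\top) \{Q(u) - Q(\bar{u}_M)\} \right] \right| \\ &\le L \|u - \bar{u}_M\| \|V^\top V\| \left\{ \|\zeta\|^2 + \sum_{i=1}^{d} E(\zeta_i^2) \right\} \\ &\le K^{1/2} L R 2^{-M} \|V^\top V\| \left( \|\zeta\|^2 + 4 d \gamma^2 \right), \end{aligned}$$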

where the second inequality follows from Von Neumann’s trace inequality (Mirsky, 1975) and 
the last inequality follows from (2). It follows that 

P I |A,(u)| > r, I «: P vW|| (||Cf + 4 <i 7 ^) > l| 

r" 1/2 r p 

< (24) 

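To bound the term in (23) that depends on $\Delta_2(u)$, we use the Hanson-Wright inequality (Theorem 1.1 of Rudelson and Vershynin, 2013), which implies that there is an absolute constant $c > 0$, such that

$$\begin{aligned} P\left\{ \sup_{u \in [0, R]^K} |\Delta_2(u)| > r_2 \right\} &= P\left[ \left| \zeta^\top Q(0) \zeta - E\{\zeta^\top Q(0) \zeta\} \right| > r_2 \right] \\ &\le 2 \exp\left[ -c \min\left\{ \frac{r_2^2}{\gamma^4 \|Q(0)\|_{HS}^2},\ \frac{r_2}{\gamma^2 \|Q(0)\|} \right\} \right] \\ &\le 2 \exp\left[ -c \min\left\{ \frac{r_2^2}{\gamma^4 \|V^\top V\|^2 \|T(0)\|_{HS}^2},\ \frac{r_2}{\gamma^2 \|V^\top V\| \|T(0)\|} \right\} \right] \end{aligned} \qquad (25)$$

(specifically, the first inequality above follows from the Hanson-Wright inequality; the second inequality follows from basic bounds on matrix norms).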

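Finally, we bound the term involving $\Delta_3(u)$ in (23). Let $s_1, \ldots, s_M \ge 0$ satisfy $s_1 + \cdots + s_M = 1$. Then

$$P\left\{ \sup_{u \in [0, R]^K} |\Delta_3(u)| > r_3 \right\} \le \sum_{j=1}^{M} P\left\{ \sup_{u \in [0, R]^K} \left| \zeta^\top \{Q(\bar{u}_j) - Q(\bar{u}_{j-1})\} \zeta - E[\zeta^\top \{Q(\bar{u}_j) - Q(\bar{u}_{j-1})\} \zeta] \right| > s_j r_3 \right\}.$$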
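By construction, for each $j = 1, \ldots, M$ and $i = 1, \ldots, K$, there is a $k = 1, \ldots, 2^j$ such that $u_{ij} = k 2^{-j} R$. Thus, for each $j = 1, \ldots, M$, there are at most $2^{jK}$ possible pairs $(\bar{u}_j, \bar{u}_{j-1})$ and it follows that

$$\begin{aligned} & P\left\{ \sup_{u \in [0, R]^K} \left| \zeta^\top \{Q(\bar{u}_j) - Q(\bar{u}_{j-1})\} \zeta - E[\zeta^\top \{Q(\bar{u}_j) - Q(\bar{u}_{j-1})\} \zeta] \right| > s_j r_3 \right\} \\ &\quad \le 2^{jK} \max_{(\bar{u}_j, \bar{u}_{j-1})} P\left[ \left| \zeta^\top \{Q(\bar{u}_j) - Q(\bar{u}_{j-1})\} \zeta - E[\zeta^\top \{Q(\bar{u}_j) - Q(\bar{u}_{j-1})\} \zeta] \right| > s_j r_3 \right] \\ &\quad \le 2^{jK+1} \max_{(\bar{u}_j, \bar{u}_{j-1})} \exp\left[ -c \min\left\{ \frac{s_j^2 r_3^2}{\gamma^4 \|Q(\bar{u}_j) - Q(\bar{u}_{j-1})\|_{HS}^2},\ \frac{s_j r_3}{\gamma^2 \|Q(\bar{u}_j) - Q(\bar{u}_{j-1})\|} \right\} \right] \\ &\quad \le 2^{jK+1} \exp\left[ -c \min\left\{ \frac{2^{2j} s_j^2 r_3^2}{\gamma^4 \|V^\top V\|^2 K L^2 R^2 m},\ \frac{2^{j} s_j r_3}{\gamma^2 \|V^\top V\| K^{1/2} L R} \right\} \right], \end{aligned}$$
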


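where we have used the Hanson-Wright inequality again in the third line above. We conclude that

$$P\left\{ \sup_{u \in [0, R]^K} |\Delta_3(u)| > r_3 \right\} \le \sum_{j=1}^{M} 2^{jK+1} \exp\left[ -c \min\left\{ \frac{2^{2j} s_j^2 r_3^2}{\gamma^4 \|V^\top V\|^2 K L^2 R^2 m},\ \frac{2^{j} s_j r_3}{\gamma^2 \|V^\top V\| K^{1/2} L R} \right\} \right].$$

Now take $s_j = (1/3) \cdot (3/4)^j$, for $j = 1, \ldots, M - 1$, and $s_M = 1 - (s_1 + \cdots + s_{M-1}) \ge (1/3) \cdot (3/4)^M$. Then

$$\begin{aligned} P\left\{ \sup_{u \in [0, R]^K} |\Delta_3(u)| > r_3 \right\} &\le \sum_{j=1}^{M} 2^{jK+1} \exp\left[ -c \min\left\{ \frac{(9/4)^j r_3^2}{9 \gamma^4 \|V^\top V\|^2 K L^2 R^2 m},\ \frac{(3/2)^j r_3}{3 \gamma^2 \|V^\top V\| K^{1/2} L R} \right\} \right] \\ &= \sum_{j=1}^{M} \exp\left[ (jK + 1) \log(2) - c \min\left\{ \frac{(9/4)^j r_3^2}{9 \gamma^4 \|V^\top V\|^2 K L^2 R^2 m},\ \frac{(3/2)^j r_3}{3 \gamma^2 \|V^\top V\| K^{1/2} L R} \right\} \right]. \end{aligned}$$

If

$$\frac{r_3^2}{\gamma^4 \|V^\top V\|^2 K L^2 R^2 m} \ge \frac{225 K^2}{\min\{c, c^2\}}, \qquad (26)$$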
then


{jK + 1) log(2) — cmin 


(9/4)Jri 


imyr, ] 


'ii'WV^VfKL^Wm' 3i‘lV^V\\Ky'‘LR] 


< —j log(2) — cmin 


r^i 




Hence, if (26) holds, 


P<; sup |A 3 (u)|>r 3 

ue[0,ii]-ff 


(27) 


^ exp 


-cmin 


r'i 


To finish the proof, we combine (23)-(25) and (27), and let $M \to \infty$, to obtain


P 


sup |C^g(u)C - E{C^Q(u)C}| > r 

ue[0,R]K 


< 2 exp 
+ exp 


cmin 


r2 


7^||HTy||2||T(0)||^s’72||yT^|| 117^(0)1 


-cmin 




whenever (26) holds. The theorem follows by taking, say, $r_1 = r_2 = r_3 = r/3$.


Appendix B Proof of Theorem 2 

We follow the proof of Theorem 2.1 in Reinert and Röllin (2009), and use Stein’s method with exchangeable pairs. Let $f : \mathbb{R}^{2K} \to \mathbb{R}$ be a three-times differentiable function. By Lemma 2.6 in (Reinert and Röllin, 2009), there is a three-times differentiable function $g : \mathbb{R}^{2K} \to \mathbb{R}$ satisfying the Stein identity

$$
E\{f(w)\} - E\{f(V^{1/2}z)\} = E\{\nabla^\top V \nabla g(w) - w^\top \nabla g(w)\} \qquad (28)
$$



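and

$$
\left|\frac{\partial^k g(x)}{\prod_{j=1}^{k} \partial x_{i_j}}\right| \le \frac{1}{k} \sup_{y} \left|\frac{\partial^k f(y)}{\prod_{j=1}^{k} \partial x_{i_j}}\right|
$$

for all $x = (x_1, \dots, x_{2K})^\top \in \mathbb{R}^{2K}$, $k = 1, 2, 3$, and $i_j \in \{1, \dots, 2K\}$. To prove the theorem, we
bound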

$$
S = E\{\nabla^\top V \nabla g(w) - w^\top \nabla g(w)\}. \qquad (29)
$$

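Next, we use exchangeability. Let $\zeta' = (\zeta_1', \dots, \zeta_d')^\top$ be an independent copy of $\zeta$, and let $i \in \{1, \dots, d\}$ be an independent and uniformly distributed random index. Define the vector $w' \in \mathbb{R}^{2K}$ exactly as we defined $w$, except that $\zeta_i$ is replaced with $\zeta_i'$ throughout. More precisely, let $e_i \in \mathbb{R}^d$ be the $i$-th standard basis vector in $\mathbb{R}^d$ and define

$$
w_k' = \{\zeta + (\zeta_i' - \zeta_i)e_i\}^\top Q_k \{\zeta + (\zeta_i' - \zeta_i)e_i\} - \operatorname{tr}(Q_k) = w_k + 2(\zeta_i' - \zeta_i)e_i^\top Q_k \zeta + e_i^\top Q_k e_i (\zeta_i' - \zeta_i)^2,
$$

$$
\tilde w_k' = \{\zeta + (\zeta_i' - \zeta_i)e_i\}^\top \operatorname{diag}(Q_k) \{\zeta + (\zeta_i' - \zeta_i)e_i\} - \operatorname{tr}(Q_k) = \tilde w_k + e_i^\top Q_k e_i \{(\zeta_i')^2 - \zeta_i^2\},
$$

for $k = 1, \dots, K$. Then $w' = (w_1', \tilde w_1', \dots, w_K', \tilde w_K')^\top$.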

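Let’s compute $E(w_k' - w_k \mid \zeta)$ and $E(\tilde w_k' - \tilde w_k \mid \zeta)$. Since

$$
w_k' - w_k = 2(\zeta_i' - \zeta_i)e_i^\top Q_k \zeta + e_i^\top Q_k e_i (\zeta_i' - \zeta_i)^2, \qquad (30)
$$

$$
\tilde w_k' - \tilde w_k = e_i^\top Q_k e_i \{(\zeta_i')^2 - \zeta_i^2\}, \qquad (31)
$$

it follows that

$$
\begin{aligned}
E(w_k' - w_k \mid \zeta) &= \frac{1}{d}\sum_{i=1}^{d} E\Big\{2(\zeta_i' - \zeta_i)\sum_{j=1}^{d} q_{ij}^{(k)}\zeta_j + q_{ii}^{(k)}(\zeta_i' - \zeta_i)^2 \,\Big|\, \zeta\Big\} \\
&= -\frac{2}{d}\sum_{i,j=1}^{d} q_{ij}^{(k)}\zeta_i\zeta_j + \frac{1}{d}\sum_{i=1}^{d} q_{ii}^{(k)}\zeta_i^2 + \frac{1}{d}\operatorname{tr}(Q_k) \\
&= -\frac{2}{d}w_k + \frac{1}{d}\tilde w_k
\end{aligned}
$$

and

$$
E(\tilde w_k' - \tilde w_k \mid \zeta) = \frac{1}{d}\sum_{i=1}^{d} E\big[q_{ii}^{(k)}\{(\zeta_i')^2 - \zeta_i^2\} \,\big|\, \zeta\big] = -\frac{1}{d}\tilde w_k.
$$

Thus,

$$
E(w' - w \mid \zeta) = -\Lambda_K w, \qquad (32)
$$

where

$$
\Lambda_1 = \begin{pmatrix} 2/d & -1/d \\ 0 & 1/d \end{pmatrix} \in \mathbb{R}^{2 \times 2}, \qquad
\Lambda_K = \begin{pmatrix} \Lambda_1 & 0 & \cdots & 0 \\ 0 & \Lambda_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Lambda_1 \end{pmatrix} \in \mathbb{R}^{2K \times 2K}.
$$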
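The linear regression property (32) can be checked numerically for a single quadratic form ($K = 1$). The sketch below is an illustration under stated assumptions, not the paper's code: it takes $w = \zeta^\top Q\zeta - \operatorname{tr}(Q)$ and $\tilde w = \sum_i q_{ii}(\zeta_i^2 - 1)$ (our reading of the definitions), and replaces a uniformly chosen coordinate by a Rademacher variable, which has mean 0 and variance 1. The helper names `summary`, `conditional_drift`, and `lambda_1` are hypothetical.

```python
import numpy as np


def summary(zeta, Q):
    """Return (w, w_tilde): the centred quadratic form and its diagonal part."""
    w = zeta @ Q @ zeta - np.trace(Q)
    w_tilde = np.diag(Q) @ (zeta ** 2 - 1.0)
    return np.array([w, w_tilde])


def conditional_drift(zeta, Q):
    """E(w' - w | zeta), averaging over the index i and a Rademacher replacement."""
    d = len(zeta)
    base = summary(zeta, Q)
    drift = np.zeros(2)
    for i in range(d):
        for new_value in (-1.0, 1.0):  # mean 0, variance 1
            swapped = zeta.copy()
            swapped[i] = new_value
            drift += (summary(swapped, Q) - base) / (2 * d)
    return drift


def lambda_1(d):
    """The 2 x 2 block appearing in (32)."""
    return np.array([[2.0, -1.0], [0.0, 1.0]]) / d


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 6
    A = rng.normal(size=(d, d))
    Q = A @ A.T  # symmetric positive semidefinite
    zeta = rng.normal(size=d)
    print(conditional_drift(zeta, Q), -lambda_1(d) @ summary(zeta, Q))
    L = lambda_1(d)
    print(np.trace(np.linalg.inv(L.T @ L)), 1.5 * d ** 2)
```

The last line also evaluates $\operatorname{tr}\{(\Lambda_1^\top\Lambda_1)^{-1}\}$, which the proof uses later with the value $(3/2)d^2$.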
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Next, we will work our way back to the Stein identity (29) and take advantage of the 
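identity we just derived (32). Define

$$
G(x', x) = \frac{1}{2}(x' - x)^\top \Lambda_K^{-\top}\{\nabla g(x') + \nabla g(x)\}, \qquad x', x \in \mathbb{R}^{2K}.
$$

By exchangeability, $E\{G(w', w)\} = 0$. Thus,

$$
\begin{aligned}
0 &= \frac{1}{2}E\big[(w' - w)^\top \Lambda_K^{-\top}\{\nabla g(w') + \nabla g(w)\}\big] \\
&= E\{(w' - w)^\top \Lambda_K^{-\top}\nabla g(w)\} + \frac{1}{2}E\big[(w' - w)^\top \Lambda_K^{-\top}\{\nabla g(w') - \nabla g(w)\}\big] \\
&= -E\{w^\top \nabla g(w)\} + \frac{1}{2}E\big[(w' - w)^\top \Lambda_K^{-\top}\{\nabla g(w') - \nabla g(w)\}\big], \qquad (33)
\end{aligned}
$$

where we used (32) in the last step. Now we Taylor expand and use some other basic manipulations to get a direct connection between (29) and (33). Indeed, by Taylor’s theorem,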

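$$
\begin{aligned}
(w' - w)^\top \Lambda_K^{-\top}\{\nabla g(w') - \nabla g(w)\}
&= (w' - w)^\top \Lambda_K^{-\top}\nabla^2 g(w)(w' - w) + (w' - w)^\top \Lambda_K^{-\top} r^{(2)} \\
&= \operatorname{tr}\big[(w' - w)(w' - w)^\top \Lambda_K^{-\top}\nabla^2 g(w)\big] + (w' - w)^\top \Lambda_K^{-\top} r^{(2)},
\end{aligned}
$$

where $r^{(2)} = (r_1^{(2)}, \dots, r_{2K}^{(2)})^\top$,

$$
r_j^{(2)} = (w' - w)^\top R_j (w' - w),
$$

and each $R_j = (R^j_{kl})$ is a $2K \times 2K$ matrix with $|R^j_{kl}| \le (1/2)|f|_3$. Thus, by (33),

$$
E\{w^\top \nabla g(w)\} = \frac{1}{2}E\operatorname{tr}\big[(w' - w)(w' - w)^\top \Lambda_K^{-\top}\nabla^2 g(w)\big] + \frac{1}{2}E\{(w' - w)^\top \Lambda_K^{-\top} r^{(2)}\}. \qquad (34)
$$

Since

$$
E\{(w' - w)(w' - w)^\top\} = 2E\{w(w - w')^\top\} = 2E(ww^\top \Lambda_K^\top) = 2V\Lambda_K^\top, \qquad (35)
$$

it follows that

$$
\begin{aligned}
E\{\nabla^\top V \nabla g(w)\} &= \frac{1}{2}E\big[\nabla^\top E\{(w' - w)(w' - w)^\top\}\Lambda_K^{-\top}\nabla g(w)\big] \\
&= \frac{1}{2}E\operatorname{tr}\big[E\{(w' - w)(w' - w)^\top\}\Lambda_K^{-\top}\nabla^2 g(w)\big]. \qquad (36)
\end{aligned}
$$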
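Identity (35) can be verified exactly on a small example by enumerating a discrete distribution. The sketch below is a hedged illustration with $K = 1$ and $d = 3$: the three-point law on $\{-\sqrt{3}, 0, \sqrt{3}\}$ (mean 0, variance 1) and the definitions $w = \zeta^\top Q\zeta - \operatorname{tr}(Q)$, $\tilde w = \sum_i q_{ii}(\zeta_i^2 - 1)$ are our assumptions for the example, and the function names are hypothetical.

```python
import itertools

import numpy as np

# Three-point distribution with mean 0 and variance 1 (an illustrative choice).
VALUES = np.array([-np.sqrt(3.0), 0.0, np.sqrt(3.0)])
PROBS = np.array([1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0])


def summary(zeta, Q):
    """Return (w, w_tilde) for a single quadratic form."""
    w = zeta @ Q @ zeta - np.trace(Q)
    w_tilde = np.diag(Q) @ (zeta ** 2 - 1.0)
    return np.array([w, w_tilde])


def exchangeable_pair_moments(Q):
    """Return E{(w'-w)(w'-w)^T} and 2 V Lambda_1^T by exact enumeration."""
    d = Q.shape[0]
    lam = np.array([[2.0, -1.0], [0.0, 1.0]]) / d
    V = np.zeros((2, 2))
    second_moment = np.zeros((2, 2))
    for idx in itertools.product(range(3), repeat=d):
        zeta = VALUES[list(idx)]
        p = np.prod(PROBS[list(idx)])
        w = summary(zeta, Q)
        V += p * np.outer(w, w)
        for i in range(d):  # uniformly chosen coordinate
            for value, p_value in zip(VALUES, PROBS):  # independent replacement
                swapped = zeta.copy()
                swapped[i] = value
                diff = summary(swapped, Q) - w
                second_moment += p * p_value / d * np.outer(diff, diff)
    return second_moment, 2.0 * V @ lam.T


if __name__ == "__main__":
    Q = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, -1.0], [0.0, -1.0, 1.0]])
    lhs, rhs = exchangeable_pair_moments(Q)
    print(lhs)
    print(rhs)
```

The two printed matrices should agree up to floating-point error, since the identity follows from exchangeability and the regression property (32) alone.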


L.H. Dicker and M.A Erdogdu/Quadratic forms and variance components estimation 




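Combining (29) and (34), (36) yields

$$
\begin{aligned}
S &= E\{\nabla^\top V \nabla g(w) - w^\top \nabla g(w)\} \\
&= \frac{1}{2}E\operatorname{tr}\big[E\{(w' - w)(w' - w)^\top\}\Lambda_K^{-\top}\nabla^2 g(w)\big] \\
&\quad - \frac{1}{2}E\operatorname{tr}\{(w' - w)(w' - w)^\top \Lambda_K^{-\top}\nabla^2 g(w)\} - \frac{1}{2}E\{(w' - w)^\top \Lambda_K^{-\top} r^{(2)}\} \\
&= S_1 + S_2, \qquad (37)
\end{aligned}
$$

where $S_1 = -(1/2)E\operatorname{tr}\{T\Lambda_K^{-\top}\nabla^2 g(w)\}$, $S_2 = -(1/2)E\{(w' - w)^\top \Lambda_K^{-\top} r^{(2)}\}$, and $T = E\{(w' - w)(w' - w)^\top \mid \zeta\} - E\{(w' - w)(w' - w)^\top\}$. Thus, in order to bound $S$ it suffices to bound $S_1$, $S_2$.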

First, we work with Si. Notice that 


|Etr{TAyv"g(w)}| « ||„sE(||r||„s) 

_ k!2^tr{(A[A.)-‘}‘/=E(||r||„s) 
< g/C'^dl/ljEdlTlHs), 

where we have used the fact that $\operatorname{tr}\{(\Lambda_1^\top \Lambda_1)^{-1}\} = (3/2)d^2$. Thus,

|Si| < 4A'V2d|/|2E(||r||Hs). 

It requires a bit more work to bound $E(\|T\|_{HS})$ in (38).


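The matrix $T$ can be written as

$$
T = \begin{pmatrix} T_{11} & T_{12} & \cdots & T_{1K} \\ T_{21} & T_{22} & \cdots & T_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ T_{K1} & T_{K2} & \cdots & T_{KK} \end{pmatrix}, \quad \text{where } T_{kl} = \begin{pmatrix} t_{11}^{kl} & t_{12}^{kl} \\ t_{21}^{kl} & t_{22}^{kl} \end{pmatrix}, \qquad (38)
$$

$$
t_{11}^{kl} = E\{(w_k' - w_k)(w_l' - w_l) \mid \zeta\} - E\{(w_k' - w_k)(w_l' - w_l)\}, \qquad (39)
$$

$$
t_{12}^{kl} = E\{(w_k' - w_k)(\tilde w_l' - \tilde w_l) \mid \zeta\} - E\{(w_k' - w_k)(\tilde w_l' - \tilde w_l)\}, \qquad (40)
$$

$$
t_{21}^{kl} = E\{(\tilde w_k' - \tilde w_k)(w_l' - w_l) \mid \zeta\} - E\{(\tilde w_k' - \tilde w_k)(w_l' - w_l)\}, \qquad (41)
$$

and

$$
t_{22}^{kl} = E\{(\tilde w_k' - \tilde w_k)(\tilde w_l' - \tilde w_l) \mid \zeta\} - E\{(\tilde w_k' - \tilde w_k)(\tilde w_l' - \tilde w_l)\}. \qquad (42)
$$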


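We conclude that

$$
E(\|T\|_{HS}) \le \Big[\sum_{k,l=1}^{K}\sum_{i,j=1}^{2} E\{(t_{ij}^{kl})^2\}\Big]^{1/2}
$$

and, furthermore, if we can control each of the terms $E\{(t_{ij}^{kl})^2\}$, then a bound on $E(\|T\|_{HS})$ will follow. Fortunately, Lemma S2 from the Supplementary Material gives bounds on these moments. Indeed, let $c(\gamma) = 4096(\gamma + 1)^8$ and $Q_{\max} = \max_{k=1,\dots,K}\|Q_k\|$. It follows from Lemma S2 that

$$
\begin{aligned}
E(\|T\|_{HS}) &\le \frac{KQ_{\max}^2}{d^{1/2}}\big[8\{108c(\gamma)^2 + 763c(\gamma) + 930\} + 4\{24c(\gamma)^2 + 69c(\gamma) + 1\} + c(\gamma) + 4\big]^{1/2} \\
&\le \frac{KQ_{\max}^2}{d^{1/2}}\{960c(\gamma)^2 + 6381c(\gamma) + 7448\}^{1/2} \\
&\le \frac{KQ_{\max}^2}{d^{1/2}}\{65c(\gamma) + 104\}.
\end{aligned}
$$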
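The constants in the last chain of inequalities are plain arithmetic and can be checked mechanically. The short sketch below (function names are ours) combines the three Lemma S2 constants, confirms that they sum to $960c^2 + 6381c + 7448$, and checks that this is dominated by $(65c + 104)^2$ for a few values of $\gamma$, using $c(\gamma) = 4096(\gamma + 1)^8$ and integer arithmetic.

```python
def c_gamma(gamma):
    """Moment constant c(gamma) = 4096 (gamma + 1)^8 from the proof."""
    return 4096 * (gamma + 1) ** 8


def combined_constant(c):
    """Sum of the Lemma S2 constants for t11, (t12 and t21), and t22."""
    return 8 * (108 * c ** 2 + 763 * c + 930) + 4 * (24 * c ** 2 + 69 * c + 1) + c + 4


if __name__ == "__main__":
    for gamma in (0, 1, 2, 10):
        c = c_gamma(gamma)
        total = combined_constant(c)
        # The collected polynomial and the final squared linear bound.
        assert total == 960 * c ** 2 + 6381 * c + 7448
        assert total <= (65 * c + 104) ** 2
        print(gamma, total)
```

The final inequality holds for every $c \ge 0$, because $(65c + 104)^2 = 4225c^2 + 13520c + 10816$ dominates the collected polynomial coefficient by coefficient.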


Combining this bound on $E(\|T\|_{HS})$ with (38) yields

|Fi|^4{5c(7) + 8}iF3/V/2|/|2d 

Next, we bound $S_2$. First consider the basic inequalities


(43) 


l'S '21 ^ -||A^^||E (||w' - w||||r(^)||) 


d 


r 1 0 ■ 

1 2 

E4 

/2K \ ^/2 

iiw'-wpW||fi«iid 



V-==i / J 


< 


^ (||w' - wf) . 


(44) 


Now focus on bounding $E(\|w' - w\|^3)$. Each inequality in the following chain is elementary:


E(j|w' — wp) = E 


= E 


K 


K 


3/2 




+=1 

K 


fc=l 


X;(2(c; - QejQti + ejQte,(C - Ci) 


2\2 


fc=l 
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K 


3/2 -1 


+ 




k=l 


< 


K 


K 


3 / 2 ' 


8 SlK)’ + C/}(eIOiC)" + 91] ll<3/.l/{K)‘ + C‘} 


fc=l 


fc=l 




d 


i=l 


K 


K 


3 / 2 ' 


8 2f(C')" + C/Ke/OtC)'^ + 92 llOill'^f(C')* + C) 


fc=l 


fc=l 


< + mAQ.cn + iiod/E(cf 


d 


i=l fe=l 


i=lfc=l 


d K 


« <3^0“) + 300A'»/^c(7),; 

i=l fc=l 


ci 


3 

max' 


It remains to bound E{ (e^ QkCY}- This is accomplished by a version of Khintchine’s inequality, 
given in Corollary 5.12 of (Vershynin, 2010). It implies that there is an absolute constant 
Cl > 0 such that 

( d "'3 

Ej(eTQiC)') < Cf (7 + 1/ I 2(4*’/ 


.1 = 1 
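Khintchine-type inequalities of the kind invoked above control moments of a linear form $a^\top\zeta$ by the Euclidean norm of $a$ (here $a$ would be a row of $Q_k$). As a hedged illustration only, the sketch below computes the third absolute moment exactly for Rademacher signs and compares it with the elementary bound $3^{3/4}\|a\|^3$, which follows from Lyapunov's inequality and $E(S^4) \le 3\|a\|^4$. Both the Rademacher assumption and the constant $3^{3/4}$ are ours, not the constant $C_1$ from (Vershynin, 2010).

```python
import itertools

import numpy as np


def third_abs_moment(a):
    """Exact E|sum_i a_i eps_i|^3 for independent Rademacher signs eps_i."""
    a = np.asarray(a, dtype=float)
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=len(a)):
        total += abs(np.dot(a, signs)) ** 3
    return total / 2 ** len(a)


def khintchine_upper_bound(a):
    """Elementary bound 3^(3/4) * ||a||^3 for Rademacher sums."""
    return 3.0 ** 0.75 * np.linalg.norm(a) ** 3


if __name__ == "__main__":
    row = [0.5, -1.0, 2.0, 0.25, 1.5]
    print(third_abs_moment(row), khintchine_upper_bound(row))
```

The enumeration is exponential in the length of `a`, so it is only meant for small sanity checks.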


Thus, 


3/2 


E(||w' - wf) « 3°°Cg”M7)”-(7+l)^ I I ^ 

k = li = l b = l J 

«+1)” + c(7)}<;/„. 

Combining this with (44) yields 

IS 2 I « 266Jfyi„{Cic(7)”’'(7 + 1)” + c(7)}|/M. 

Finally, combining (28)-(29), (37), (43), and (45), we obtain 

|E{/(w)( - E{f(V'/^z)}\ g. 2{5c(7) + SiK^/V/^hql^ 


3 

max 


(45) 


+ 133/i'®{C,c(7)72(7 + 1)“ + c(7))l/l3<i9: 
< (7(7 + 1)* + K^fUqi^) 


3 

max 


for some absolute constant C > 0, which proves the theorem. 
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Appendix C Proof of Proposition 1 


The proof of Proposition 1 is based on several lemmas, which are stated precisely and proved in
the Supplementary Material. Several of these lemmas (Lemmas S3, S5, and S7) are basically
corollaries of our uniform concentration bound for quadratic forms, Theorem 1.

To prove the proposition, first let $r \ge 0$. Since $\hat\sigma^2 = \hat\sigma^2(\hat\eta^2)$ and $\sigma_0^2 = \sigma_0^2(\eta_0^2)$, it follows that


|j|0 - 0o|| >r^ = {{f- vlf + (<5-^ - 

^ ^ I sup \al{r]‘^) - al{ri‘^)\ > ^ 

t VJ |^0^77^<oo 2,^1 L 


Additionally, since 


ll 1 
1=1 


Vo^i + 1 


772 Ai + 1 


- 1 


cr. 


W 


Vo\ 


§ V^Xi + 1 


< ^ 0^1 If 


f I 


we conclude that 


0-0o|| >r \ ^ \ \fi -%\> 


2V2(c'‘g + l)(Ai + 1) 


sup ) - (Toiv )\ > 


0^r]^<a3 


2 V 2 


(46) 


We bound the probability of the two events on the right-hand side in (46).

Bounding the probability of the second event in (46) is easy, thanks to Theorem 1 and
Lemma S3. By Lemma S3, there is a constant $0 < C_0 < \infty$ such that


P 


sup If (f)-fT^(f)| > 

0!£r;2<oo 


r 

2 V 2 


X 


< Co exp 


—nu{A)‘^ 

Co7^(7^ + 1) r + 1 


(47) 


where $\mathfrak{v}(\Lambda)$ is defined in (S27).

Bounding the probability of the first event on the right in (46) takes more work. In fact,
we further decompose the event as follows:




2-\/2{aQ + l)(Ai + 1) 




(48) 


where 


A- = 


+ l)(Ai + 1) 
yl+ = ) \ > 


< \ f -Vo\ < 


bo 


bo 


2\[2[ci^ + l)(Ai + 1) J 


2'\/2(crg + l)(Ai + 1) J 

To bound $P(A^-)$, we use properties of the profile score function




'L 

drf 

= -y^ {+ I 


n \p 


P 


(49) 


- <^2(y)hr i (ixA'd ( AxX^ + / 


n 


P 


P 


In particular, let 44(ri, r 2 ) = {There exists r]^ > 0 such that ri < \rj'^ — ? 7 q| < r 2 and 
0} and observe that 


bo 


12'\/2(c’‘o + l)(Ai + 1) 2'\/2 (o'q + l)(Ai + 1) 


Furthermore, 


yl(ri,r 2 )c<^ sup \H^{Tf) - Ho{Tf)\^ inf \Ho{r]‘^)\} , 

l0s:»?2<oo 

where H^irf) = E{//=f(? 7 ^)|X}. By Lemma S4 in the Supplementary Material, if 


2\/2(c’‘q + l)(Ai + 1) 


< Ib^ -bol ^ 


bo 


2'v/2(o'q + l)(Ai + 1) 


l^o(b^)| > 


a^r 


32'v/2(crg + l)(Ai + l)®(?7o + 1)^ 


KA). 


then 
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Thus, 


A c .j sup 

O^rj^<oo 


> 


air 


32V2(c’‘q + l)(Ai + l)®(/7o + 1)^ 


0(A) 


Now we can use Lemma S5, which is an application of Theorem 1, to bound the probability
of the right-hand side above. We conclude that there is a constant $0 < C_1^- < \infty$ such that


'|X) ^ exp 


nan 




C*! 72 (-y 2 7 l)(cro + l)^(?7o + + 1)^® '^(^) + 1 T + l 


(50) 


To bound $P(A^+ \mid X)$, we consider the cases where $n_0 = n$ and $n_0 < n$ separately. First assume
that $n_0 = n$. Lemma S6 (a) from the Supplementary Material implies that


A-* 


4(ho^)-4(f)> 
^o{Vo)-^*iVo 


> 




8{al + mx, + mvl + ly 

16(a2 + l)2(Ai + mr^l + 1)2 

2n ^ ^oX(^o>^)o(A) 


sup \ioiv ) - )\ > 

0 ^ 7 ]^ <CO 


16 (a 2 + l)2(Ai + 1)2(72 + 1)2 


16(^2 + l)2(Ai +1)2(72+ 1)^ 


Next we apply Lemma S7, which depends on Theorem 1. Lemma S7 (a) implies that there is 

^oX(ho>A)o(A) 


a constant $0 < C_1 < \infty$ such that


P(A+|X)<P< sup \io{'n)-^*{v)\> 


0^ri^<oo 


^ exp 


n 


16 (a 2 + l)2(Ai + 1)2(72 + 1)2 

alvMAyxivlAf 


X 


o(A)^ 


0 + nx^ + i){ai + mvi + imi + ir { o ( a ) + i)pj 


(51) 


Part (a) of the proposition follows by combining (46)-(48) and (50)-(51).
To prove part (b) of the proposition, assume that $n_0 < n$. Since


o(A) ^ A: 


1 - 


no\ no 


n 


n 


the inequality (50) implies that 
P(A“|Af) < Cy exp 


ncr^(l - no/nf{no/n) 


+ l){al + 1)2(72 l)8(Ai + 1 ) 16 (Aa 1 + 1)^ r + ij 


(52) 
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Additionally, by Lemma S6 (b), 


4+-^ mo E (n^)\ - xivl - no/n){no/n)r]l 


Hence, Lemma S7 (b) implies that there is a constant $0 < C_2 < \infty$ such that

x(ho> A)(l - no/n)(no/n)77^ 


P(A+|A:)<p| snp I4(h^)-4(b^)| > , 2 , iA2m I 1^2^ 2 , ia 2 

[ 0s:7?2<oo 16(crg + l)^(Ai + iy{r]^ + 1)^ 


a: 


(53) 


^ C'2'' exp 


n 


(^oVoX{vl^)‘^{^? 


1 _ //H] 

c+ + I)(a2 + I)3(j^2 + 1)4(A^ + 1)4 V nJ \n) 


Part (b) follows from (46)-(48) and (52)-(53). This completes the proof of Proposition 1. 

Appendix D Proof of Proposition 3 

Proposition 3 is a direct application of Theorem 2, in conjunction with some basic Taylor
expansions. However, keeping track of all the quantities to be bounded does require some
effort. Let / e By (17), on the event that 77^ A 0, 




6 — 9o — —Jo{0o) S{9 q) — Jq(6q) {J{9o) — Jo{9o)}{9 — 9q) — J()(9q) r, 

where $r \in \mathbb{R}^2$ is a remainder term. Furthermore, by Taylor’s theorem,


r = 


1 

' {9 9oV - 


0 

_1 

2 

0 

_1 

i&S2(0') 

} {0 Oo) 


and $\theta^* \in \mathbb{R}^2$ is on the line segment connecting $\hat\theta$ and $\theta_0$. Thus, defining $w = \sqrt{n}(\hat\theta - \theta_0)$ and

applying Taylor’s theorem again,

f{w) = f{-^Jo{9o)-^S{9o)} 


^fJo{^o) {J{^o) ~ Jo{&o)]{G ^ ^0) + ^fJoi^O 




where $r_f \in \mathbb{R}^2$ and $\|r_f\| \le \sqrt{2}|f|_1$. Now define the event

2 log(n) 


A= -IW - 0nll < CTn 


2^/n 
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and let 1a denote the indicator of A. Then 


E{/(w)|X} = E{/(w)1a|X} + E{/(w)l^c|X} 

= E [/ {-V^Meo)-^S{6o)}\X] - E [/ {-v^Jo(0o)^'^(0o)1a4| X] 



+ E{/(w)1a.|X} 


and it follows that 


|E{/(w)|X} - E{/(T 1 /^Z 2 )|X}| ^ Ai + A 2 + As + A 4 , (54) 

where 


Ai 

A 2 

^3 

A 4 


|E [/ {-v^Jo(0o)-'^(0o)}| X] - E{f{^^/\2)\X }\, 
|E{/(w)Uc|A}| + |E [/ {-v^Jo(0o)-'^(0o)} ^]| , 


E 


■MOo)-^{J{eo)-Meo)}{e-eo)h 


X 


v^|E{r}Jo(6lo)-'rl^|X}|. 


To prove the theorem, we bound $A_1$, $A_2$, $A_3$, $A_4$ separately.

To bound $A_1$, we use Theorem 2 with $K = 2$, $x \mapsto f\{-J_0(\theta_0)^{-1}x\}$ in place of $f$, $\zeta = (\sqrt{p}\,\beta^\top/\tau_0, \epsilon^\top/\sigma_0)^\top \in \mathbb{R}^{n+p}$, and


Qi 

Q 2 


2a^ 


2a2 


Vp 

(To/ 

r^X^ 

Vp 

(To/ 


AA^ + / 


p 


-AA^^ f^AA^ + I 
P J \P 




" 0 / 


Since 



(^g + l)(^g + l)(Ai + l)^ 
2(jl^Jn 


Theorem 2 implies that there is a constant $0 < C_0 < \infty$ such that


Cain + l)^((Tg + Ifivi + l)^(Ai + 1)^ ^ p + n 

aQ vAA 

.{||j„(e„)r=' + i}(|/b + i)(|/|3 + i). 


(55) 
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Next, we bound l|Jo(^o)|| First observe that 
Joi0) =E{Ji9)\X} = 

It follows that 


1 _ O'o V" + ^ 

.2 
0 


n Ai(?7gAi + l) 1 


2 ( t 4 i 


2ji=l 


_ -^n __ 

= 1 (r)2Ai + l)2 2n 2 _Ji=l ^ 2 ^ Zlji=l 


^0 

2 (T‘^n 

A? 


o'o Ai(?7^Ai + l) 

Zji=l 


UpAl+Tp" 

g-Q Af(rjgAi + l) 

(j2y2 Zj2=1 


det{Jo(0o)} — 


A? 




A, 


> 


1 ^ (Ai - Aj)^ 

+ l) 2 (r; 2 Aj + 1 )^ 

’p(A) 

+ i)4(A^ + 1)4 


and, since each entry in Jo{6o) is bounded in absolute value by (cTq + l)^(Ai + 1 )^/( 2 (Tq), we 
conclude that 

4 


\M0c 


ii-i 


< 


KA) 


K + mv^o + l)^(Ai + 1)^ 


Combining (55)-(56), there is a $0 < C_1 < \infty$ such that

(7 + 1 ) 8 (( t 2 + l)9(r/2 + + 1)^^(|/|2 + 1)(|/|3 + 1) P + n 


Ai 


KA)%o^ 


n 


3/2 


Bounding $A_2$ is straightforward. Since $|f(w)| \le |f|_0$, it follows that

$$
A_2 \le 2|f|_0 P(A^c \mid X).
$$


(56) 


(57) 


(58) 


Now we move on to $A_3$. In order to obtain the desired bound, we need to do a little bit
of preliminary work. We begin by bounding $E\{\|J(\theta_0) - J_0(\theta_0)\|^2 \mid X\}$. Let $J_{kl}(\theta)$ denote the
$kl$-th element of $J(\theta)$ and observe that


J(0) 


MO) MO) ■ 

< 721 ( 0 ) < 722 ( 0 ) 

1 _ 1 spn yf 1 y-in Myf 

2cr‘l t7®n Aj=l 20-4)1 Ai=l 

1 ^iVi J_ V” _3_ V” 

2o-4ii 2 lJi=l + 2n Ai=l a^n Ai=l 
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where y = {yi,..., yn)~^ = U~^y- Since the operator norm is bounded by the Hilbert-Schmidt 
norm, 


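$$
E\{\|J(\theta_0) - J_0(\theta_0)\|^2 \mid X\} \le E\{\|J(\theta_0) - J_0(\theta_0)\|_{HS}^2 \mid X\} = \sum_{k,l=1}^{2}\operatorname{Var}\{J_{kl}(\theta_0) \mid X\}. \qquad (59)
$$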
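The first inequality in (59) rests on the fact that the operator (spectral) norm of a matrix never exceeds its Hilbert-Schmidt (Frobenius) norm. A minimal numerical sketch of this fact, using NumPy's `numpy.linalg.norm` with `ord=2` and `ord="fro"` (the helper name is ours):

```python
import numpy as np


def operator_and_hs_norms(A):
    """Return the spectral norm and the Hilbert-Schmidt (Frobenius) norm of A."""
    A = np.asarray(A, dtype=float)
    return np.linalg.norm(A, 2), np.linalg.norm(A, "fro")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for _ in range(5):
        A = rng.normal(size=(2, 2))  # same shape as J(theta_0) - J_0(theta_0)
        op, hs = operator_and_hs_norms(A)
        print(f"operator norm {op:.4f} <= HS norm {hs:.4f}")
```

Equality holds exactly when the matrix has rank at most one, so the step loses at most a factor $\sqrt{2}$ for the $2 \times 2$ matrices considered here.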

The variances on the right-hand side in (59) can be bounded using Lemma S8 from the 
Supplementary Material, since each term is the variance of a quadratic form. Indeed, 

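$$
\operatorname{Var}\{J_{kl}(\theta_0) \mid X\} = \operatorname{Var}(\zeta^\top Q_{kl}\zeta \mid X),
$$

where $\zeta = (\zeta_1, \dots, \zeta_{n+p})^\top = (\sqrt{p}\,\beta^\top/\tau_0, \epsilon^\top/\sigma_0)^\top \in \mathbb{R}^{n+p}$, and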

1 


Qn — 


aln 


(roM/ 2 )^T 

(To I 


Qi2 — Q 2 I — 


1 


2(7072, 


P 


(roM/ 2 )XT 

(TqI 


XX^ + / [ {toM/^)X aol ] , 


p 


xx^ 


Vo 

P 


-2 


XX^ + / [ (ro/pV2)x aol ] , 


Q 22 = 


Gon 


(roM/2)^T 

Go I 


P 


2/2 
Vo 


P 


-3 


XX^ + / [ {to/p^/^)X goI ] . 


By (9), E(g) < l^Tiri + lY/{ylGl). Additionally, 


llQiill) IIQ12II) IIQ22II ^ 


(^o^ + ir(%^ + l)(Ai + l) 


CToTl 


Thus, by Lemma S8, 


Var(C^gfczClA^) < 


n + p { 16(ao + l)®(?7o + 1)‘^(7 + l)^(Ai + 1)" 




(To Vo 


Combining this with (59), we have 


, A:,/= 1 , 2 . 


E{\\J{9o)-MeoW\x} 


< 


n + p ( 64(0-0 + l)^(? 7 o + 1)^(7 + l)^(Ai + 1)" 




(Pq'^Vo 


(60) 


Thus, by (56), (60), and the definition of $A$,

23|/|i( 7 + l)^(^o^ + invl + l)^(Ai + 1)^ 


A, < 


(PqVM^) 


n 


log(ll). 


(61) 
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To bound A 4 , we need a preliminary bound involving r. By some basic manipulations, 


E(lr||l^|.Y)<iE 






Si{e*) 


+ 






s2{on 


We-eofiA 


X 


< 


V2 


r f 


2 

] f 

d^ . 

2 

11 

E<^ 

80 ^ ^ ^ 

1a 

A V +E<^ 

082(0*) 
80 ^ ^ ^ 

1a 

"}| 


1/2 


52 


a^n 


Now we find the derivatives 
M6) = 

-^ 82 ( 6 ) = 
and observe that on A, 


Thus, 


• E( \\e-0o\\HA 


X 


1/2 


3 '^ri Vj _]_ 1 ^iVi 

(T®n Aii=l ri'^X^+l (76 a^n Aji=l 

1 ^iVi 1 V'” 

a^n (r; 2 Ai+l )2 cr^n 

-2 


■6 a^n 
1 

■In 2lJi=l (r)2Ai + l)3 

1 V"' 1 

2 —ii=l ppX/TTp' 0-4n Ai=l (rpA/TTp' 

_ V” 3 V” — 1 V” 


cr°n 
1 sr^n 
a^n 


d^ 

2 

d^ 

0^1 6 > 

d 02 ^ ^ 

1 

,, 2 S 2 (r, 


" , 5888(^2 + i) 8 (a, + 1)8 / ||y||‘-\ 

<rJ 8 V nX 


1/2 


E(||r||l^|Jf) « 77(<rg + l)8(Ai + 1)8 |^ ^ i_j.,||y||4|7f) 

^0 in 

1/2 


Efiie-OoW^A 


X 


1/2 


< 


77(2rg + 1)8 (Ai + 1)8 r ^ {a;e(||p‘/^/ 3||8) + E(||e||‘)} 
(Jq I n 

/ - \ 1/2 
■ E we-eXu x] 


< 


< 


•E(|| 0 - 6 )o||"lA|-y 

621(7+ l)^(ag + l)^(r/g + l)(Ai + ir P + ' 

/ - \ 1/2 

•E(||0-0oriA Xj 

156(7 +l) 2 (cTg + 1 )^( 7 ^ + l)(Ai + 1 )^ P + n 
-4-log ^ 

CTg n 2 


(62)

Combining this bound with the definition of $A_4$ yields

883|/|i(7 + 1 )^( 0 -^ + lYivl + P + n 


Aa < 




n 


3/2 


log(n)' 


(63) 


Finally, the theorem follows by combining (54), (57)-(58), (61) and (63). 
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Supplementary material: 

Flexible results for quadratic forms with 
applications to variance components estimation 


S1. Proof of Proposition 2


To prove Proposition 2, we begin by retracing the steps of the proof of Proposition 1. Following
the proof of Proposition 1 in Appendix C, we have


e-eo\\>r\^-{ sup \a^{r] ) - a^ir] )\ > 


0 !£r)^<oo 


2 V 2 


" ^ " 32V2WI H- Srim + 1 )^ 

0!£7?2<oo 16(crQ + l)^(Ai + + 1)^ 


where we have adopted the notation from Appendix C, except that a tilde indicates all of the 
y’s in the corresponding quantity are replaced by y = X/3 + e. We further decompose the 
event {||0 — 0o|| > r} and obtain 


III 0 — 00 II > rj E vj {El n Eo) u {E2 n Eo) u {E^, n Eo n E^,) vj El u E'l (SI) 

where 


E=l sup \ai{ri^)~al{ri^)\> 


0^ri^<ao 


4 V 2 


u < sup \H^{rj^) - Ho{'if)\ > 


(ylrX){K) 


0^ri^<tx> 


64V2K + i)(a,+ l)J^(,2 + l)-i 
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E.-\ 

sup \al{rf) 

r 

B2 = | 

sup \HYY) 

1 0^r]'^<co 

f 

£ 3 -] 

sup 14(7^) - 

1 0^ip<ao 

E, - 1 

(4 n"*^" 

/■ 

E..\ 

I sup \al{iY) 


4V2 ’ 


agro(A) 




64V2(ag + l)(Ai + l)®(?7o + 1)'^ )’ 
32(^2 + l)2(Ai + + 1)2 


(Tn 


0^r]^<(X) 


To prove the proposition, we bound the probability of the various events on the right-hand
side of (S1).

First, from the proof of Proposition 1, it follows that there is a constant $0 < C < \infty$ such
that


n t)(A)^ 


C 72 ( 72 - 1 - 1 ) {ti(A)-I-1}2 (r 4-1)2 


P(^|X) < Cexp 
¥{E\X) < Cexp 
To bound P(i?i n Eo\X), note the inequalities 

1 ,%T f 


n K{al,ril,K) 

no^^ 4 ^no ^^2 

J.2 

C 72(72 + 1 ) 

\ n / V n / 

(r -f- 1)2 


, if no = n, (S2) 

, if no < n. (S3) 




< 


-XX' + I 


p 


-1 


y —y 

n 


1 ,T f V 


-XX^ + / 


p 


{(3 - (3yx^ ( ^ATX^ + l] X{f3 - fi) 


+ 


P 


-{I3-~^YX^ (^XX^ + I 
n \ p 


^^\\p-gf + PEEliyiii3-gi 

n n 
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Thus, on Eo 


< 4 (Ai + l)(a^ + - M (l + 11/3 - 3ll) 

and it follows that 

£, n £„ e 1 1/3 - 3|| (i + ||/3-3||)> 


16\/2(Ai + + l)i/2(,2 + 1)1/2 + p 


n 


^ ii/3-3ir> 


n 


2 ^2 


1024(Ai + l)^(o-Q + l){r]Q + 1) \n + pJ 1 + r 
Hence, there is a constant $0 < C_1 < \infty$ such that

I ~ 1 71 T 

nE, n E.|X) ^ P 111/3 - /3|| > - . «:(ao^ A) — ■ 

Next, to bound P(i ?2 n Eo\X), 


(S4) 




(S5) 


< 


-(y-yf -XX 
n \p 


t\ (T 
P 


+ -(y - y)^ 

n \p 

2,„2 , 2 Afp‘/^ 


XJfT + / (y _ y) 


P 


-XX' +/ 


-2 




^ ^ll/3-/3ir + - Wfl - M\\y\\ + XiWlir]'^) - al{p‘^)\ 


n 


n 


s— !^||/3-/3p + -t-^||y||||/3-/3||. 


n 


n 


Hence, on Eo 


mv^) - ^ A 


p + n 


n 


and 


) (A. + lf(al + + 1)1/2||,3 - ^1 (l + 1/3 - ^l) 

(Tgro(A) 


0^??2<oo 64'\/2(crQ + l)(Ai + 1) (?7 q + 1) 


n Eo 


c 11/3-311 1 + 11/3 


> 


alx>{X)r 


256V2(a^ + l)3/2(r/2 + l)9/2(Ai + 1)^ \pn 


n 
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(Tn 


n 




262144(a-g + l)^(? 7 o + + 1)^'^ \p + nJ 1 + t)(A) 1 + r j ’ 

Consequently, there is a constant $0 < C_2 < \infty$ such that

t)(A) r 


~ 1 TX 

F(E, nE,\X)^FU0-m>^^- --ot A) ■ — ■ ^ ^ ^ ^ 


Now we bound P(-E '3 n i?o n Ei,\X). Note that on the event E, 

2 \ —1 

L 

P 


- y ( -xx^+1 


y ^ 


n{rf\i + 1 ) 


yir^ 


CTn 


A{rf\i + 1 ) 


It follows that if we are on the event $E_0 \cap E_*$, then


|4(h^) -4(h^)| < - 






log 


Ml 
m] 




+ 




^iiv) 


(Tn 
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6p 

a^n' 


11/3-3ir+ 


12 AfpV^ 

a^Xn^ri 


lly||||/3-3ll 


40 \ ^/2^1/2 

« ^1/3 - ;3|r' + ;/ {rilx, + 1)1/3 - ail 

^ 0 ^ Xno n 

48(ag + l)(ho + l)(^i + + 1)^^^ P 


6p 


< 


(To 


n 


11/3-/3|| 1 + 11/3 


Thus, 


E^ryE,r.E.<^\\\f3-(3\\{l + \\^- 


> 


vWoX{ph A)D(A)n/(p + n) 


1536(^2 + l)3(Ai + l)+2(r^2 + 1)3 

IIP ^11 vt(Tlx{plA)n/{p + n) P(A) ] 

1536(a2 + l)3(Ai + l)+2(r^2 + i)3 ’ 1 + o(A)V2 J 


and there exists a constant $0 < C_3 < \infty$ such that

P(B3 n n B.IX) < P I lla - ail > + ■ « 
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(S 6 ) 
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Finally, it remains to bound ¥{E^\X) and ¥{E'^\X). A bound on the former follows directly 
from Theorem 1, which implies that there is a constant $0 < C_0 < \infty$ such that

n K{al,ril,X) 


¥{E:\X) ^ Co exp 


(S 8 ) 


Co 7^(72 + 1 ) 

Bounding F{E^\X) is also easy: Just replace rj{A-\/2) with o-q/S in the dehnition of Ei and 
use (S4) to obtain 


r ~ 1 71 

F{E‘\X) < P 1/3 - /3|| > — . 4al A) ■ —— 

o* p ^ n 

The proposition follows by combining (S1)-(S3) and (S5)-(S9). 


(S9) 


S2. Supporting lemmas 

Lemma S1. Let $t_{ij}^{kl}$ be as in (39)-(42), from the proof of Theorem 2. Additionally, let $\mu_i^{(m)} = E(\zeta_i^m)$. Then

Ad 


,kl _ 
^11 ~ 


Yj Ci 0 ( 1 + Cm) (Imi dmj + Y Cj K - Cf) Qu + Qu^ dij ) 
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m=l 

^ fi Xi {(Ci^C| - 1) + (C| - 

d 




1 


+ ^ SKCf - hf ) - 2(C^ - C, 


i=l 


^12 = 1 2 Ci(h® - Ci + Ci)q^\u + - 2 (C^ - l)}qfq^i , 


d, ■ . , 

2 . . (3) 


i=l 
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^21 = J Y CjChi^^ - Ci + CDqu^qfj + ^- 2(C^ - i)]qfqfi, 

^22 = \ Zi{(C^ - - 2 (C^ - l)]q^^qu- 


2=1 


Proof. Let Wk = (to^, Wkj ^ and write 
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Vn ^2 
V 21 V 22 
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VkI Vk2 ■ ■ ■ VkK 
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(511) 
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where 


Vm 


Then (35) implies that 


E{wkWi) E{wkWi) 
E{wkWi) E{wkWi) 


E{wkwJ). 


E{(w' — w)(w' — w)"''} = 2 


WiAl 

W2A7 •• 

ViirA^ 

f"2iAr 

V22A7 

V2K^ 

Vki^I 

Vk2^ ■ ■ 

■ hiC'ic'A^ 


Then 


i'll = E{(!(;(, - Wk){w[ - - -;:{2E{wkWi) - E{WkWi)}, 


d' 


ti2 = - Wk){w'^ - Wfc)|C} “ ^E{WkWl), 


d 


t^i = E{(t&fc - Wk){w'i - Wz)|C} - -;:{2E{WkWi) - E{WkWi)}, 

d 

^22 = - Wk){wi - h;z)|C} - ‘^{wkWi). 


(514) 

(515) 

(516) 

(517) 


Proving each of (S10)-(S13) is now a straightforward, yet tedious, calculation, though (S11)-(S13) are equivalent. We begin with (S13) (the simplest identity) and work our way up to
(S10).

We have: 


<22 “ - 'ik)(w[ - ti'iZIC} - -jHiim) 

= E [{(CO" - Ci)"(e0O*e,)(e0O,e,)|C] 

- + ^E(C'<OPC)E(C^Q?C) 


= E [{(CO" - C"}"(eOQ*e.)(eO<?,e,)|C] - -E 




_*J = 1 


ii 


5 1;E [{(CO" - C."}"IC] g<09''’ -1 

i=l i=l 

^ d, r) d 

i (A\ (k\ (n / 


+ -tT{Qk)tT{Qi) 

d 




i=l 


i=l 


i=l 


(fc)^(O 

ii 


i=l 
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- 2(C^ - l)}qu^q, 


d 


ii ■ 


2=1 


Thus, (S13) holds. Next, we prove (S11):


*12 ” ~ 

- E [2(c; - Ci){(c;)" - C|}(eJO.C)(eJO,eO|C] 

+ E [(C' - Ci)X(C;)" - C|}(ej0te,)(ej0,eJ|C] 

- 3E{(c^q,c)(C^Q°C)} + 3E(c^g,c)E(C^gPc) 


d 


d 


d 


^E [(c: - 0){(C')^ - Cn(e7Q;^C)(e7Q.e,)|C] 


2=1 


+ [K.' - - C?)(e:<3»e.)(e.:*Q,ei)|C] 


2=1 


2 Zi + § Z 




*J = 1 


d 


Se [{(C')® - iCVQ - C'C? + CfMQki)(eJQ,e,)\(] 


2=1 


+ [{(CD* - 2(C.')"Ci + 2C'C? - C;‘)(ejQte,)(eT<5,e.)K] 


2=1 

d 




(fc) JO 

22 Hjj 


2=1 


z - 0 + Cf) ^ Z(0^^ - 20 J - Ct)<ik 


3^Jfc)J0 ^ ^ V^,(4) _o„(3)2 


d 


*J =1 


d 


(1) 
Hi 


2=1 




2=1 


5S(Ci*.f'-ci= + cM\<'' + § E C2(4 >-c. + c?)4’4 


2=1 


d 




+ i2(4'-2fif>ci-tf)«f4 


2=1 
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i=l 

1 V ^ ^ 1 V;^a4 _ „(4)^ 

d 


Yj - Ci + C!)q^Uu + 2 ~ i)}4^ 


(fc)^(O 

a * 


i=l 


The identity (S12) follows immediately from (S11) by symmetry. Finally, we prove (S10):


= E{{w'f, - Wk){w'i - tyOIC} - ^IE(tyfcWz) + -E{wkWi) 

d d 

- E [4(c; - C.)"(e7QiC)(e7Q,C)IC] + E [2(C; - C.)'’(e7QiC)(e7Q,e.)|C] 

+ E [2(C; - Ci)"(e7Qte,)(e7Q,<)|C] + E [(g - C.)‘(e7Q»ei)(e7Q,e,)|C] 

- ^E{(C‘"Ot<)(C‘"ftC)} + ^E(CTQtC)E(C^Q,C) 


+ jE{(C‘"Ot<)(C‘"Qr<)} - jE(C‘"QtC)E(C‘"QPC) 

[K: - Ci)"(e7e/.C)(e7QiC)IC] + §IiE[(C' - Ci)“(e70<=C)(e70iei)l<] 


2=1 


d^ 


2=1 


d 


+ [«.' - Ci)'‘(e7Q.e.)(e7Q,C)K] + ^Te [(C.' - Ci)‘(e7Qtei)(e7Q.e.)|C] 

i=l i=l 

* 2 E(C.OUC„)«'7’9tl + § 2 E(C.0C7)9«’9t!„ + 5 2 

*J =1 


d 


i,j,m,n=l 




-|]E [{((')" - 2C'Ci + C7)(e7QtC)(e7Q.C)IC] 


2=1 


+ jT® [{(C.')“ - 3(0% + 3C.'C7 - <:!]{{%QkOiejQiO + (e7Qte.)(e70,C))K] 


2=1 

d 


+ 5 He [{(C.')‘ - MO% + 6(0% - %<3 + C7)(e7Ote,)(e70,e.)|C] 


2=1 


2 Y Y ®^(CKiCmCn)glf -2 Y + 2 Y 


l^mi^n^d 2,7 = 1 


d . 




*J = 1 


d 


Yji^ + Cl)(ejQkC){ejQi<) 


2=1 
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+ - Ki - Cf){{eJQkC){eJQiei) + {ejQkei){ejQiC)} 


d 


i=l 

d 


+ 




d 


i=l 




2=1 

A Yj + ChmUmj + § S ~ 

2 


d 

*,j=i 


+ 


d r\ d 

(.k)(l) ^ 


- irifQ + 6C? H- - 2 I] ,lf 4' - i 


d 


2 = 1 


d 


Y c.od + + § E C4f-f’-3c.-c?)(«<f«‘;>+9f4>) 


^ • 1 

ij=i 1=1 

3m»J 0 , „(fc)^( 0 ^ 


d 


m=l 


d 


l^i^j^d 


+ 1 E {(c?cj-i) + (c|-i)}9'f4’ + iEf(c‘-i)-2(c?-i)k'.‘’«''’ 


d 


l^i^j^d 


2=1 


The identity (S10) and the lemma follow. □

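Lemma S2. Assume that the conditions of Theorem 2 and Lemma S1 hold and let $c(\gamma) = 4096(\gamma + 1)^8$. Then

$$
E\{(t_{11}^{kl})^2\} \le \frac{8\{108c(\gamma)^2 + 763c(\gamma) + 930\}}{d}\|Q_k\|^2\|Q_l\|^2, \qquad (S18)
$$

$$
E\{(t_{12}^{kl})^2\} \le \frac{2\{24c(\gamma)^2 + 69c(\gamma) + 1\}}{d}\|Q_k\|^2\|Q_l\|^2, \qquad (S19)
$$

$$
E\{(t_{21}^{kl})^2\} \le \frac{2\{24c(\gamma)^2 + 69c(\gamma) + 1\}}{d}\|Q_k\|^2\|Q_l\|^2, \qquad (S20)
$$

$$
E\{(t_{22}^{kl})^2\} \le \frac{c(\gamma) + 4}{d}\|Q_k\|^2\|Q_l\|^2. \qquad (S21)
$$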


Proof. First note that (2) implies $E(|\zeta_i|^m) \le c(\gamma)$, $m = 1, \dots, 8$. This moment bound for the
$\zeta_i$ will be used repeatedly below. Each bound in the lemma follows from a direct calculation.
However, (S18) is substantially more involved than the others; we save this bound until the
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end. First, we derive (S21). We have 


E((*S)4 = J 2 Se [{(cf - d*’) - 2 (c? - 1 ))"] 


i=l 

d 


< 


d? 

1 




2=1 

d 


2=1 

0 ( 7 ) +4 \\2\\r^ Ii2 


< 


d 


■WQkVWQiW- 


Next, we prove (S19). Let ..., and, for matrices $A$, $B$, let $A \circ B$ denote their Hadamard product. Then

e{( 42)4<4 L L E{oc«(4’'-c. + c?)(#<7-Cm + 0}4‘’9«9M„ 


2 

1 ? 


< 


+ 52 L® [((c" - '■4) - 2 (cf - 1 )}"] (4*’4'’)- 

2=1 


1^0^ j^d 


-0 + Cf) E 


+ 


SJ ' Sj 


2 E]ci(/i®-o+a 


“ 0 + c 


(fc) ( 0'|2 

ij ’dij ) 


d^ 




ij Hij ) 


+ ^ S {Cj - Ci + Cf){yy - Cm + C™)} 


d 


+ 




i=l 


32 


V (3) (3) (fc) (0 (fc) (0 
^ Zj W 9mi 




+ 4 L {4'' + 477-34''-4^' + 3(4’7 + 2)(,1|>47 


+ 


d 2 

2 


d 


4L(d'”-44'' + 4)(,r9: 


(fc)„( 0^2 


2=1 
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64 


12 / i ^ii ^ij 


(P 


*j=i 

d 


+ 42 (4“’ + 4‘Vf - 3m1‘' - ,<;*> - (4’')'^ + 2)(<,1‘>4V 




*,i=i 
d 


+ 4 - 84‘> - 4(44= + 164^> + 20(44= - 4}(,f 4?) 


,(4)n2 


(4) 


,(3)n2 




d 2 


2=1 


< ^A^J(Qfc°Q«)V3-^MjQfcQ?(QfcoQz)At3 


+ ?Ml)±42)!±4it,((Q^ „ q,)2, ^ 2{17c(7) + 20c(7)=} „^^p„^^p 


^54(2)||q^„q,P^64c(-,) 


+ 


d d 

8{c(7) + c(7)2 + 2} 




< 


2{24c(7)2 + 69c(7) + 1 }„^ „2„^ „2 


\\QkoQiV + 

WQkfWQif, 


d 


2 , 2{17c(7) + 20c(7)n,|^ , 12 , 1^,12 


d 


-WQkVm 


where we have used the fact that $\|Q_k \circ Q_l\| \le \|Q_k\|\|Q_l\|$ (Theorem 3.1 of (Horn and Mathias,
1990)). Thus, we have proved (S19); (S20) follows immediately by symmetry. Finally, we
bound the second moment of $t_{11}^{kl}$. Observe that

$$
E\{(t_{11}^{kl})^2\} \le D_1 + D_2 + D_3 + D_4, \qquad (S22)
$$


where 

H K{c. 0 C.c.(i + 0 ( 1 + c=)} 41444444 

m,n=l 

l^u^v^d 

= f I! H E{c,<„(4=’-3c.-ci’)(fiS>-3c„-o} 

i; T e[{(c=c=-i) + (c=-i)}((0=-i) + (c=-i))]4‘’4'4SM'.>. 
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D, = 


(P 


[{(c? - df) - 2(c? -1))"] 


We bound $D_1$, $D_2$, $D_3$ separately. First we consider $D_1$, the most complicated term. Define
the diagonal matrix $M_k = \operatorname{diag}(\mu_1^{(k)}, \dots, \mu_d^{(k)})$ and observe that


64 


m,n=^ 


64 


^ ^ + Cm)(l + Cn)} {(limQmjQlnQnj + Qmj Qnl) 


(P 


64 


+ 9 2 (1 + C^)(l + Cm)} 


+ 


P 

64 




9 S {C^C|(i + C|)(i + Cm)} {ql^kfjql^q% + qfqf^qflcf^' 


P 

64 


l^i^j¥^Tn^d 


+ 9 S {C^C|(i + Cm)(i + C^)} {ql^q^qS^ qfj + q^^q^qfqu" 


(P 

64 


l^i^j¥^Tn^d 


^ Xj ^{CiCji^ + Cm)(i + C|)} {qimqmjqij^qfj + q^^q%qfj q^}) 


+ 


p 

64 






p 

64 






p 

64 




+ 92 ®^ {C^C|(i + C|)(i + C^)} {qfqfjqu\fj + qfqfjqfiqfh 


p 


l^i^j^d 


64 






256 


P 


+ 


P 

128 

~P 


/ j K^im^mj^in ^nj ' / 

l^i^j^m^n^d 

+ ^ Yj (1 + f^i'^^)iqu^qfjq^qmj + qfj qj^qlHi) 

2 (1 + + 4 ^^ ^jj ^jmqmi) 

l^i^j^m^d 


l^i^j^m^d 
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128 


0 w 0 , (k) 0 W On 

+ ^ 2j (1 Qii ) 

, V n , (4 )n/ (fc) (0 (fc) (0 , (*:) (0 (fc) (On 

+ ^ 2j (1 + /^i ) (9*m <lrnj Qij Qjj + Qlni Qmj Qjj Qji ) 


d? 

64 


,( 6 ) 




(*^) JO Jfc)^(0 ^ JJ JO^(fc) J0^ 


Ci2 

64 






,( 4 ) 


ik)Jl)Jk)(l) , Jk)(l)(k)Jl), 


(P 

64 




+ ^ S ^1) (4^^ ^jj ^ij + ^) 


,(4) 


ik)(l)(k)a) , (k)(l)(k)(l). 


d? 

64 




+9 2 + 


.(4) 


(^)J0Jfc)^(0 , Ak)(i)(k)(i), 


d^ 


l^i^j^d 


256 


d^ 


/ j / ] x^tm^mj^in ^nj ' ^m / 

m,n=l 

128 , _(4) 




+ 


+ 


l^iT^jp^m^d 
128 . (4) 


Z (^r “ + qfqfjqflqmi) 


d^ . ■ ■ . 

128 . (4) 


+ 


d^ 

128 

64 


Z ^ ) (dim Qmj Qu^ qfj + dim Qmj ) 


l^i¥=j¥=m^d 


Z (^f ~ 1 dmj qf qfj + qf2 q2jqfj qfi ] 


l^i¥=j¥=m^d 




d^ 

64 


l^i^j^d 


+ ^ Z “ 3) (gjf gif qf- + gif gf gjf g]?) 


.(4) , ,,(4) 


ik)Jl)Jk)(l) , (k)(l)(k)(l). 


d^ 

64 




+i s (0‘vr +0*’+Af - 3)(«i;>«j(o.?'4’ +9if js'id'') 
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256 


/ j / j K^im^mj^in ^nj ' ^ni / 


+ 


m,n=l 

256 




2 ~ ^){(lu^(lfj(lim(lmj + Qu^QfjQjmQmi 


+ (ifSfhflQmi + 

+ ^ S ^ij + 

IsSi^isSd 

, ( fc ) (0 ( fc ) (0 , ( fe ) (0 ( fe ) ( On 

+ Qii qii Qii qli + Qa qli qli qh ) 


nji '111 'iji '111 ' 'iji '111 '111 'll] 

,64 y (,,(4) (4) (4) (4) ow„(fc)„(0„(fc)„(0 , 

^ Zj ~'^)\qii qij qjj qji +^qii qtjqtj q: 

l^i^j^d 

+ q^^qjUj’lUii) 

+ qimqmjqfJqn^^ 2 


d 2 


2,^,771,71=1 


• 1 
2 , 7 , 771=1 


^ y (4) _ (fc) (0 (A:) (0 (fc) (0 (fc) 

^ ^2 / J v/^2 ^ZJ ^2771 ^7717 ^ ^22 ^ZJ 

2 , 7 , 771=1 

Sk)Ai)Sk)Ai) ^ Jfc) JO Jfc) JO 


+ oJJJ J"^ J‘^. + oJJJflW.W.) 
' 'I72 'ill ^7771^7712 ' ^72 ^22 ^2771^7717/ 

256 


2 


*j=i 
d 

(k) (1) (k) (0 , (fc) (/) (/c) (/) 

^2 Zj vPi - qijqij^ + 

*5=1 

+ qjfquqjfqfi + qfiqfh^^^fj) 

s (mS" - Dft'fos:'j'o? 

Oi=i 
d 


(P 




2 = 1 


+ S Zi “ ^){q^\fj q^^fj + q^\fUfi^ii 


p 


Oi=i 
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, (k) (0 (fc) (0 , (fc) (0 (fc) (1). 

+ (Tji (Tu (Tji (Tii + Q)i <Ej) 


^ji ^ji ' ^ji ^ij 

.(4),,(4) , ,.(4) , ,,(4) nxtjk)(l)(k)(l) 


*J =1 

, (fc) (0 (fc) (On 

+ Qf/QfXi Qfi) 


256 f (6) 

(P 


2=1 


(4) 


,W„( 0 n 2 


256 y ( WWW WWW (On 5^ y „W„(0„W W 


(P 


i,j,m,n=l 


^ (p / J ■*-/Vy22 ^2^ ' ^22 ^2^ 

2 , 7 .m=l 




,(4) 


2,7,271=1 


,(fc) JO Jfc) JO , Jfc) JO Jfc) JO 


2,7,271 


+ qfQuQ^lQmi + qfqfh^^qmj) 

+ 52<4'’-2j' + l){(,i‘'j0 + (j9l'’0} 


+ ^ J (0°’ - W/'!*’ + S)qfqS’qfq 


*J = 1 
d 


P 


64 


,{k)ji)jk) (i) 
ji 




+ i 2 (0‘'J - 30‘> H- wf H- 1)(<J J J'J + of J J J 


,(4) , ,,(4) 


ik)P)Jk)(l) , Jfc) JO Jfc) JO 


Oi=i 

, (fc) (0 (fc) (0 , (fc) (0 (fc) (z)n 
+ Qn qiiQiiQn + Qn QiiQii q] ) 


dji Hii Hjj Hji Hji 

,(4)^2 


256 ^ (6) , i , (4)^2 (4) 

P 


ZH +(/^r0^-4j: 

2=1 


ij ^33 
+ 2 


J)J0j 
• 22 ^22 ) 


2^6 ^12 

— {tr(Qfc( 3 i( 3 z( 3 fc) + tr(QfcQzQfcQz)} - -^tr o (Q^Qz)} 

256 

[tr {(M4 - I)Q^QiQiQk] + tr{(M4 - I)Q^QiQkQiW 


+ 


+ 


+ 


P 

256 


P 

64 


P 

128 


[tr {(M4 - l)QYQkQkQi] + tr {(M4 - l)QYQkQiQk]] 

[tr {(Me - 2M4 + l)gi;’QzQzQr} + tr {(Mg - 2M4 + l)Q]^QkQkQ^]] 
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64 


+ -tr {M,Q^QiM,Q^Qk - 3M,Q^QiQ^Qk + Q^QiM.Q^Q^ + Q^QiQ?Qk} 


d? 

64 


+ -ti{M,Q^QiM,Q^Qi - 3M,Q^QiQ^Qi + Q^QiM.Q^Qi + Q^QiQ^Qi] 


(P 

64 

+ ^tr {M^QfQkM^Q^Qi - 3M^QYQkQ^Qi + Q^QuM^Q^Qi + Q^QuO^Qi] 
64 

+ —tr {M^QYQkM^QYQk - 3 M^QYQuQ]^Qu + QYQuM^Q^^Qu + QYQkQYQk] 


256 

~d? 


tr {(Mg + Ml - AM^ + 2)(g°g°)2} 


< + ^\mw + 

128 { 3 c ( 7 ) + 1 }„^ „ 2„^„2 , 128 { 11 c ( 7 ) + 9 }„^ „2 


+ 


+ 


\\QkV\\Qir + 


d 

256{c(7)^ + 4c(7) + 1 } 


d 


iigfciHlQiir + 


^ -WQkVWQif 

256{c(7)^ + 5c(7) + 2} 
d 


WQkfWQif 


512{c(7)2 + 10 c( 7 ) + 8 } 
d 


mim 


(S23) 


To bound $D_2$, 
16 


^2 = ^ S E{C>®-30-a^ 


16 ^ 

+ ^ 2 E 


WJI) 

ij ^ii 


Qu + Qk^Qij? 


d? 


E 


^ ,y^-3Q-C 


(3) 


i\h^i 


-30-C 


16 

P 




ik)Jl)\2 

ij ) 


1^0^ j^d 
16 




,( 4 ) 


W JO 


,WJ 0 ^/'Jfc)J 0 


Jfc)J0' 


p 


l^i^j^d 


V I,.( 6 ) 

P 


2, {+6jf^ - +9| 


,(3)^2 


*J = 1 
16 


{k)(l) (k)(l)^2 




,( 4 ) 


(fc )„(0 


Jfc)J 0 ^/„(fc)J 0 


Jfc) 


d 2 


*J = 1 
2=1 


^ 16{7c(7) + 9} Y , 32{7c( 7) + 9} y JO , 16{7c(7) + 9} y jk) 

^ ~ / i ' ' J2 / j ^ij ^ij j2 / i \^ii 


i,j=l 




*J = 1 


d? 

.6 

J2 ^ 'll! 'll] 'ijj • ^2 

*J = 1 

+ § t + 3)( J H- 3) J J J4’ + I 2 + 3)( J + 3) J J J? 


+ § 2 (m'*' + 3)(m 7 + 3),f 7'J4’ + § S (Ml*’ + 3)( J + 3),<71'’J J 


*J = 1 
d 


*J = 1 


*J = 1 


+ 5^iiQ4nio<r 

16(7i:(7) + 9} ^ 2tr(Q^Q,Q,QP) + tr{0“0,Q,0“)] 

16 

+ — [tr {(M 4 + 3I)QfQk(M4 + 3/)gPQ4 + tr {(M 4 + 3 /)QPgfc(M 4 + 3/)QPQ4] 
16 

+ - [tr {(M 4 + 3 /)gPQK ^4 + 3/)gpg4 + tr {(M 4 + 3 /)gpgKM 4 + 3/)gpgJ] 


+ 


32c(jf 

d 


< 


64{7c(7) + 9},,^ 211^112 , 64{c(7) + 3}2 ,, 211 ^ ns , 32c(7)2 


d 


■iigfcii iig«ii + 


d 


'\\Qk\\ \\Qi\\ + 


d 


WQkfWQif 


32{3c(7)2 + 26c(7) + 36} 
“ d 

Next, we bound $D_3$: 

64 


mw- 


(S24) 


D. = 


d^ 


Ya -1) + (C| - 1)}{(C-d -1) + (C^ -1)}] 


+ “ 2 e[{(7c|-i) + (C|-i)}{(C71-i) + (C1-i)}] J’JsfJi' 


+ 


d^ 

64 

64 




Yj e [f(c?Cj -1) + ((] - i)}{(7cf -1) + id - 1 )}] J 




+ - Yj ^ [{(C^Cl - 1) + (C| - 1)}{(C^C| - 1) + (C| - 1)1] dffqScQq. 
+ 


64 

64 


2 E [{(C?C| - 1) + (C| - 1)){(C?C| - 1) + (C| 




+ ^ 2 E[{(C?C|-1) + (C|-1)){(C|C?-1) + (C? 




64 




+ 


+ 


64 

64 




2 “ 4) ^m] Qmj 




+ S S ((4^>+3)^f-4}(,.'f4') 




Ci 2 

64 


l^i^j^d 


+ 9 S (MfVf 


-,ik)Al)\2 




l^i^j^d 


512 


d^ 




l^i¥=j¥=m^d 

64 


+ 9 2 (24*'4‘’ + 44^’+4‘’-7)(,lf,®) 


.W^( 0^2 


d 2 




d 


^2 / j vr^2 ^'i-3 ^im^im 

ij,m=l 


512 

IP 

512 

~P 

512 






*5=1 

d 


E (*<r - 1)(4'‘'4''>) 


(fc) J 0 n 2 


*5=1 

d 


E (Mf -1).<‘5<'5<‘5<'’ 


*5=1 


1 )}] («“ 4’)' 
1 )}] («lf4’)' 
1024 






(p 


2=1 


, (4) (4) , ^ (4) , (4) ,,W (k) («)\2 

+ 2j M viQij Qh ) 


2 = 1 


d 


512 


S (4*' -1)4 


(k) (i) (fe) (0 
ij Qij QimQirn “ 


1024 , f4'i .s (fc) (0 (fc) (i) 


2,7,772=1 

64 


(i2 


2 - 1)^^ 


II 9m Qij Qij 


*J=1 


+ iS(24‘Vr-44‘> + 4‘’i-i)(4f4'')^ 

i,j = l 


-^i:{2(i‘r’)^-iii‘r’+9}(4‘’4'') 

2 = 1 


{k)Jl)\2 


4r{(M4 - i){QkQi) o (g^goi - ^tr{(M4 - gg^g^g^gj 


(P 


128 


+ —{^4(gfe o Qi)M/^{Qk o g^) — 2(gfc o Qi)M4{Qk o g^)} 

64 64 

+ {M4(gfc o QiY + (g*: o g^)^} — ■[(2M4 — 11M4 + 9)(g;i.g^ )^| 


< 


(P 

128{2c(7)2 + 20c(7) + 17} 
d 


ig^iriig/ir- 


It remains to bound $D_4$, but this is easy. By (S13), 


£>4 ^ 


Combining (S22) and (S23)-(S26) yields 


d 


(S25) 


(S26) 




The lemma follows. 


□ 


Lemma S3. Assume that the random variables $\beta_1, \ldots, \beta_p, \epsilon_1, \ldots, \epsilon_n$ satisfy (9). Additionally, 
define 


uj{A) = 


(Ai + + 1)2 


(S27) 
(a) There is an absolute constant $0 < C < \infty$ such that 


sup > r 


X y ^ (7 exp 


n 


C 


—u;(A) ,-^w(A) 


7 


7 


for all $r \ge 0$. 

(b) Assume that $n_0 = n$. There is an absolute constant $0 < C < \infty$ such that 


P<{ sup r] \a^{r] ) - a^ir] )| > r 

0^T]^<OD 


X > < (7 exp 




for all $r \ge 0$. 

Proof. Define C = e"'')''' e MP+"' and tiijf) = {rfXi + 1)“^. Then 


= -C^Q(7^)C, 


n 


where $Q(\eta^2) = V T(\eta^2) V^\top$, $T(\eta^2) = \mathrm{diag}\{t_1(\eta^2), \ldots, t_n(\eta^2)\}$, and 


D = 


p-l/2xT 

I 


u. 


Thus, $\sigma_*^2(\eta^2)$ can be expressed as a quadratic form and we can apply Theorem 1 with X = 1. 
We prove part (b) of the lemma first. Assume that $n_0 = n$ and notice 


P<( sup r] \a^{ri ) - ao(r/ )| > r 

Oi^n^<co 


= P 

O^TJ^ 

^ + Pl\ 


X 


sup ri \C Q{r])C-^{C Qiv)C\X}\>ns 

O^ri^Koo 


X 


(S28) 


Pf = p 

sup r]‘^\C'^Q{r]‘^)C-'^{CQ{ v‘^)C\X}\ > nr 

X 



J 

+ 

II 

sup C'^Q(7^)C - E{C'^Q(7^)C > nr 

X 


l^ri^Kco 



where 
We'll apply Theorem 1 twice, to $P_1$ and $P_2$ separately. In order to apply Theorem 1 to $P_1$, 
we need to derive a Lipschitz bound, as in (1). For $0 \le u^2, v^2 \le 1$, 




\[
\left| \frac{u^2}{u^2 \lambda_i + 1} - \frac{v^2}{v^2 \lambda_i + 1} \right| \le |u^2 - v^2|.
\]
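The Lipschitz bound above can be checked numerically. The following is a minimal sketch, not part of the original proof: it assumes the weights have the form $t_i(u^2) = (u^2 \lambda_i + 1)^{-1}$, and the function names, grid size, and eigenvalues are hypothetical choices made for illustration. It estimates the largest difference quotient of $u^2 \mapsto u^2 t_i(u^2)$ on $[0, 1]$ for a few values of $\lambda_i$.

```python
import numpy as np


def scaled_weight(u2, lam):
    """u^2 * t(u^2), assuming t(u^2) = 1 / (u^2 * lam + 1)."""
    return u2 / (u2 * lam + 1.0)


def max_lipschitz_ratio(lam, grid_size=201):
    """Largest |f(u2) - f(v2)| / |u2 - v2| over distinct grid points in [0, 1]."""
    grid = np.linspace(0.0, 1.0, grid_size)
    u2, v2 = np.meshgrid(grid, grid, indexing="ij")
    mask = u2 != v2  # skip the diagonal to avoid 0 / 0
    num = np.abs(scaled_weight(u2, lam) - scaled_weight(v2, lam))[mask]
    den = np.abs(u2 - v2)[mask]
    return float(np.max(num / den))


# Hypothetical eigenvalues; lam = 0 is the worst case for this bound.
ratios = {lam: max_lipschitz_ratio(lam) for lam in (0.0, 0.5, 2.0, 50.0)}
for lam, ratio in ratios.items():
    print(f"lambda = {lam:5.1f}: max difference quotient = {ratio:.6f}")
```

Every quotient should be at most 1, which is what the derivative $(u^2 \lambda_i + 1)^{-2} \le 1$ predicts; the value 1 is attained only when $\lambda_i = 0$.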


Additionally, for $\eta^2 \ge 0$, we have the bounds 

IVW||«A, + 1, liyrtf)N 2 d' II>?W)IIhs< (^29) 

TXno + ^ \VXno + ^J 

where we have used the fact that $\lambda_n = \lambda_{n_0} > 0$. Thus, Theorem 1 implies that there is a 
constant $0 < C_1 < \infty$ such that 


Pi < Cl exp 


n 

— —— min 


Cf 


74(Ai + 1)2’ 72(Ai + 1) 


(S30) 


whenever ^ Cf 7 ^(Ai + 

We can't immediately apply Theorem 1 to bound $P_2$, because the supremum inside the 
probability is over a non-compact interval. However, observe that $P_2$ can be rewritten as 


II 

sup 7 ‘^)c ^)C\X}\ > nr 

X 





Now we can apply Theorem 1 as soon as we derive the required Lipschitz bound. For $0 \le u^2, v^2 \le 1$, 


.-2j 


\u 


ti{u ) - V %{v ^)| = 


A,- -t- w2 A,- -t- ^2 
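The reparametrization behind this step can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the authors' argument: it assumes $t_i(\eta^2) = (\eta^2 \lambda_i + 1)^{-1}$, so that the substitution $\eta^2 = u^{-2}$ turns $\eta^2 t_i(\eta^2)$ into $(\lambda_i + u^2)^{-1}$ on the compact range $0 \le u^2 \le 1$, and it assumes the relevant Lipschitz constant is $\lambda_n^{-2}$, where $\lambda_n > 0$ is the smallest eigenvalue. The eigenvalues and grids are hypothetical.

```python
import numpy as np


def scaled_weight(eta2, lam):
    """eta^2 * t(eta^2), assuming t(eta^2) = 1 / (eta^2 * lam + 1)."""
    return eta2 / (eta2 * lam + 1.0)


def reparam_weight(u2, lam):
    """The same quantity after substituting eta^2 = 1 / u^2."""
    return 1.0 / (lam + u2)


lam_min = 0.25                         # hypothetical smallest eigenvalue, > 0
lams = np.array([lam_min, 1.0, 4.0])   # hypothetical eigenvalues

# 1) The substitution is exact for u^2 in (0, 1], i.e. eta^2 in [1, infinity).
u2 = np.linspace(0.01, 1.0, 100)
lhs = scaled_weight(1.0 / u2[:, None], lams[None, :])
rhs = reparam_weight(u2[:, None], lams[None, :])
identity_gap = float(np.max(np.abs(lhs - rhs)))

# 2) Difference quotients on [0, 1], rescaled by lam_min^2, should not exceed 1.
grid = np.linspace(0.0, 1.0, 201)
a, b = np.meshgrid(grid, grid, indexing="ij")
mask = a != b
worst = 0.0
for lam in lams:
    quotient = (np.abs(reparam_weight(a, lam) - reparam_weight(b, lam))[mask]
                / np.abs(a - b)[mask])
    worst = max(worst, float(quotient.max()) * lam_min**2)

print(identity_gap, worst)
```

The rescaled quotient stays below 1 because $|(\lambda_i + u^2)^{-1} - (\lambda_i + v^2)^{-1}| = |u^2 - v^2| / \{(\lambda_i + u^2)(\lambda_i + v^2)\} \le \lambda_n^{-2} |u^2 - v^2|$, which is why positivity of the smallest eigenvalue matters in this part of the proof.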




where we have again used the fact that $\lambda_n = \lambda_{n_0} > 0$. Combining this with (S29) and Theorem 
1 implies that there exists a constant $0 < C_1' < \infty$ such that 


^ exp 


n 

- 1 - mm 

L 


y(Ai + l)2(A-^-M)4’72(Ai + l)(A-^-,l)2 


(S31) 


whenever ^ C'^ 7 ^(Ai + 1 )^(A „|3 -t- l)^n Part (b) of the lemma follows by combining (S28) 
and (S30)-(S31). 

To prove part (a), drop the assumption that $n_0 = n$. Our proof strategy is the same as 
in part (b), but the proof is easier because we don't need to worry about whether or not 
$\lambda_n = 0$. Briefly, observe that for all $\eta^2 \ge 0$, $\|T(\eta^2)\| \le 1$ and $\|T(\eta^2)\|_{\mathrm{HS}} \le n^{1/2}$. Additionally, 
if 0 < ^ 1, then \ti{u'^) — < Ai|n^ — and \ti{u~'^) — — v‘^\. 

Proceeding just as in the proof of part (b), it follows that there are constants $0 < C_2, C_2' < \infty$ 
such that 


sup - Uo(?7^)| > r 


X 


^ C 2 exp 


snp \al{r]‘^) - al{r]'^)\ > r 

l!£r;2<oo 


< C 2 exp 


2 

X 

n 


n . \ r r 

-rnin < - - 

l7nAi + l)^’7"(Ai + l)2 


mm 


L l7^(Ai + 1)2(A-^ + 1)2’ y(Ai + 1)(A-^ + 1) 

whenever ^ C';^ 7 ^(Ai + and r ^ C' 2 ' 7 ”^(Ai + l)^(A“p + l)2n“\ respectively. This 

implies part (a) of the lemma. □ 

Lemma S4. Let $H_0(\eta^2) = E\{H_*(\eta^2) \mid X\}$, where $H_*(\eta^2)$ is given in (49). For $\eta^2 \ge 0$, 

^ V (7g-7^)(A.-A,)2 

2n^ ( 72 Ai + l)^(? 7 ^Aj + 1 )^’ 

^5.7 — -L 


MD - S S 


(S32) 


where $H_0(\eta^2) = E\{H_*(\eta^2) \mid X\}$. Hence, 


^olho 


Proof. The inequality (S33) follows from (S32) and the identity 

-| n 1 ^ ^ ^ 

i,j = l 2=1 ^J = l 


(S33) 


To prove (S32), rewrite $H_0(\eta^2)$ as 

u ^ ^ y Ai( 7 oAi + 1) _ 2 y ^ y Vo^t + A 

^ n (^^Ai + 1)2 0 + [nf^^rj^Ai + lJ 

f v^o\ + l \ f A. _ A, A 

77,2 7^2Aj + 1/ V 72 Aj + 1 rj^Aj + 1) 
^ y + A ( A, _ A, \ 

n? \V^^j + 1/ + 1 V'^^j + 1/ 

where the last expression is obtained by interchanging the indices $i, j$ in the previous expression. 
Adding the last two expressions yields 

2H (n^) = ^ Y f ^ f _^ 

0 ^2 /L y 2 y. _|_ I + ly \rfXi + 1 rfXj + 1J 

y 

n? {YXi + l)‘^{YXj + 1)^' 


Equation (S32) follows. □ 
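The symmetrization step in this proof (interchange $i$ and $j$, add, and halve) is an instance of the elementary identity $n^{-1} \sum_i a_i b_i - n^{-2} \sum_i a_i \sum_j b_j = (2n^2)^{-1} \sum_{i,j} (a_i - a_j)(b_i - b_j)$. The sketch below checks it numerically for the hypothetical choice $a_i = (\eta_0^2 \lambda_i + 1)/(\eta^2 \lambda_i + 1)$ and $b_i = \lambda_i/(\eta^2 \lambda_i + 1)$, which reproduces the double-sum shape of (S32). The eigenvalues and the values of $\eta^2, \eta_0^2$ are made up, and any overall constant or sign convention in $H_0$ is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = rng.uniform(0.0, 3.0, size=6)   # hypothetical eigenvalues
eta2, eta0_2 = 0.7, 1.9               # hypothetical eta^2 and eta_0^2
n = lam.size

a = (eta0_2 * lam + 1.0) / (eta2 * lam + 1.0)
b = lam / (eta2 * lam + 1.0)

# Left side: "mean of products minus product of means".
cov_form = np.mean(a * b) - np.mean(a) * np.mean(b)

# Right side: symmetrized double sum over all ordered pairs (i, j).
pair_form = np.sum(np.subtract.outer(a, a) * np.subtract.outer(b, b)) / (2 * n**2)

# Closed form after simplifying a_i - a_j and b_i - b_j.
diff = np.subtract.outer(lam, lam)
denom = np.multiply.outer((eta2 * lam + 1.0) ** 2, (eta2 * lam + 1.0) ** 2)
closed_form = (eta0_2 - eta2) * np.sum(diff**2 / denom) / (2 * n**2)

print(cov_form, pair_form, closed_form)
```

All three numbers agree to rounding error. The closed form makes the sign transparent: the double sum is nonnegative, so the expression has the sign of $\eta_0^2 - \eta^2$.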

Lemma S5. Let $H_0(\eta^2) = E\{H_*(\eta^2) \mid X\}$, where $H_*(\eta^2)$ is given in (49). Additionally, assume 
that the random variables $\beta_1, \ldots, \beta_p, \epsilon_1, \ldots, \epsilon_n$ satisfy (9). There is an absolute constant 
$0 < C < \infty$ such that 


sup - hro(??^)| 

0!£»?^<oo 


> r 


A > ^ (7 exp 


n 

mm 


C [ 7^(Ai -I- 1)® ’ 7^(Ai + 1)^ 


for all $r \ge 0$. 

Proof. The proof is similar to that of Lemma S3. Let , e"'')''' e and rewrite 


= -CQ{v‘^)C, 

n 

where $Q(\eta^2) = V T(\eta^2) V^\top$, $T(\eta^2) = \mathrm{diag}\{t_1(\eta^2), \ldots, t_n(\eta^2)\}$, 


U{rf) 


V 


Xi 


X, 


n 


2 (772Ai + 1)2(772Aj + 1)’ 


p 

I 


u. 


7 = 1, ... ,77, 


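Then 

\[
P\Bigl\{ \sup_{0 \le \eta^2 < \infty} |H_*(\eta^2) - H_0(\eta^2)| > r \Bigm| X \Bigr\}
= P\Bigl\{ \sup_{0 \le \eta^2 < \infty} \bigl| \zeta^\top Q(\eta^2) \zeta - E\{\zeta^\top Q(\eta^2) \zeta\} \bigr| > nr \Bigm| X \Bigr\}
\le P_1 + P_2, \tag{S34}
\]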
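where 

\[
P_1 = P\Bigl\{ \sup_{0 \le \eta^2 \le 1} \bigl| \zeta^\top Q(\eta^2) \zeta - E\{\zeta^\top Q(\eta^2) \zeta\} \bigr| > nr \Bigm| X \Bigr\},
\qquad
P_2 = P\Bigl\{ \sup_{1 \le \eta^2 < \infty} \bigl| \zeta^\top Q(\eta^2) \zeta - E\{\zeta^\top Q(\eta^2) \zeta\} \bigr| > nr \Bigm| X \Bigr\}.
\]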


We bound $P_1$ and $P_2$ separately, using Theorem 1. 

In order to apply Theorem 1, again we need to check the Lipschitz condition (1) and get 
bounds for $\|V^\top V\|$, $\|T(\eta^2)\|$, and $\|T(\eta^2)\|_{\mathrm{HS}}$. If $0 \le u^2, v^2 \le 1$, then we have the Lipschitz bound 


- U{v^)\ = 




A 7 


A,; A 7 


n + iy{u^\j + 1 ) (f^Aj + iy{v'^Xj + 1 ) 


^ 1 ^ XfXj\Xi — Xj\\v^ — u^\ 

^ n (n^Aj + iy{u‘^Xj + l)(n^Aj + iy{v‘^Xj + 1 ) 

(A," + 2A7A,)|A7-A,||n^-n^| 

^ + 1 ) 

(2Aj + Aj)|Aj — Aj||n^ — 

^+ l)^(M^Aj + l)(n^Aj + l)2(n^Aj + 1) 

< 24Xl\u'^ - v'^l, 


(S35) 


for $i = 1, \ldots, n$. Additionally, for $\eta^2 \ge 0$, 


yT^HAi + l, ||r(r72)H2Ai, ||r(r72)||^s<4AK 


(S36) 


Thus, Theorem 1 and (S35)-(S36) imply that there is a constant $0 < C_1 < \infty$ such that 


Pi < Cl exp 


n 



74(Ai + 1)6’72(Ai + 1)3 


(S37) 


whenever ^ C'i7'^(Ai + 1)®?7, 

Turning our attention to $P_2$, we have 


sup \CQ{r] 2 )C-E{C^Q (7 ^)C}|>»^ 
00725:1 



F 2 = P 


(S38) 
We aim to apply Theorem 1 again, but first need the following Lipschitz bound: 


\ti{u - ti{v ^)| = 


n®(Aj — Xj) i 

n ^ (Aj + u‘^y(Xj + n (A* + v^)^(Xj + 


v^(Xi - Xj) 


< 


-V 

n 


XjXj\Xi-Xj\\u^ 


^ (Aj + u‘^)‘^{Xj + M^)(Aj + n^)^(Aj + 


+ 




u^v^iXf + 2 AiAj)|Ai - Aj 


m 


n (Aj + u^)^{Xj + n^)(Ai + v'^)^{Xj + 


1 


1 M'^n"‘( 2 Ai + Aj)|Aj — Aj||n^ — v^\ 

n (Aj + v?)‘^{Xj + v?‘){Xi + n2)2(Aj + 


< 241^2 




(S39) 


for $0 \le u^2, v^2 \le 1$, $i = 1, \ldots, n$. Now apply Theorem 1, using (S36) and (S39), to conclude 
that there is a constant $0 < C_2 < \infty$ such that 


P 


sup 1C Qiv )C-E{C Qiv )C}|>?’ 

r ^ r ^2 
^ 62 exp — — mm 
(^2 


X 


(S40) 


7nAi + l)6’72(Ai + l)3 
whenever ^ C 2 'j'^{Xi + The lemma follows from (S34), (S37)-(S38), and (S40). □ 

1 


Lemma S6. Let 


xiVo,^) = 


2{r^l + l)4(Ai + l)4(A7i + 1)2 


(a) Suppose $n_0 = n$. Then 

(b) If $n_0 < n$, then 


2 n « / 2 ^ ^ -^o)^x(ho:A.) 22 


{\p^ - + 1)2 


t’(A), ?7,7o^O. 



4(h^) ^ 


iv^-vinivlA) (. 

(172-70^1 + 1)2 V 


no^ 

no 

n ) 

n 


7^7o ^ 0 - 
Proof. Let $T(\eta^2) = \eta^2 a + b$, where $a, b \ge 0$ are nonnegative numbers to be specified further 
below, and note that 


2 n I 


(S41) 


By Taylor’s theorem, 

/ ^ y; ^pAi + A / hpAj + 1 _ }_Sp Vo^i + 1 
yn ^ h^Aj + ly \h^Aj + 1 n ^ 'q'^Xi + 1 

1 f vpXj + l l^^ r7gAi + l ^ 2^2 
2h] yq^Xj + 1 q^X, + lj ' 

where hj ^ 0 is between 

(S42) 

Now let $h \ge 0$ be any number satisfying $\max_{j=1,\ldots,n} h_j \le h$. Summing from $j = 1, \ldots, n$ in 
(S42) and plugging this into (S41) yields 

\[
\ell_0(\eta_0^2) - \ell_0(\eta^2) \ge \frac{T(\eta^2)^2}{2h^2} A(\eta^2, \eta_0^2), \tag{S43}
\]

where 


Aiv^Vo) = ^ 


y / ?7oAj + 1 _ ^ y Vo^i + 1 
^ yj.^2x. + 1 q^Xi + 1 


^ ^ y / q^Xj + 1 \ _ ^^ y hpAj + 1 

n ■H \q‘^Xi + 1/ ^ V'^^i + 1 


j=i 


rp 


Vo - V 




y {q^Xj + l)(Aj - Xj) 

(r/ 2 \ + l)^q^Xj + 1 ) 

' y (bpAj + l)(At - Xj) 

(h^Ai + l){q^Xj + 1)2' 
Adding the last two expressions above and dividing by two, we obtain 

\[
A(\eta^2, \eta_0^2) = \frac{(\eta^2 - \eta_0^2)^2}{2n^2} \sum_{i,j=1}^{n} \frac{(\lambda_i - \lambda_j)^2}{(\eta^2 \lambda_i + 1)^2 (\eta^2 \lambda_j + 1)^2}.
\]


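Thus, combining this with (S43), it follows that 

\[
\ell_0(\eta_0^2) - \ell_0(\eta^2) \ge \frac{(\eta^2 a + b)^2 (\eta^2 - \eta_0^2)^2}{4 h^2 n^2} \sum_{i,j=1}^{n} \frac{(\lambda_i - \lambda_j)^2}{(\eta^2 \lambda_i + 1)^2 (\eta^2 \lambda_j + 1)^2}. \tag{S44}
\]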


Now we consider the cases where $n_0 = n$ and $n_0 < n$ separately. 

Suppose that $n_0 = n$ and let $a = \lambda_1$, $b = 1$ in $T(\eta^2)$. Then we can take $h = (\eta_0^2 + 1)(\lambda_1 + 1)^2(\lambda_n^{-1} + 1)$, 
and (S44) implies 






_ iv^ - vi? _ 

2(r;2Ai + l)2(r72 + l)2(Ai + l)4(A-i + l) 

_ iv^ - vir _ 

2{\v^ - vi\ + mvi + 




+ 1)2 


0 (A). 


Part (a) follows. 

Now assume that $n_0 < n$. Let $a = 0$, $b = 1$ and $h = (\eta_0^2 + 1)(\lambda_1 + 1)$. Then, by (S44), 






- vlf 


no 


(A, - A,)' 


4(77o + l)^(Ai + l)2n2 .2-^ [jfXi + l) 2 (r 72 A' + 1) 


{y^-vinn-no) ^ A 2 
2{r]l + l)2(Ai + l)2n2 ^ ijfXi + 1)2 

_- vlf _ 

W - vi\ + mvi + l)^(Ai + 1)^(A-„1 + 1)2 



no 

n ) 

n 


This implies part (b). □ 

Lemma S7. Assume that the random variables $\beta_1, \ldots, \beta_p, \epsilon_1, \ldots, \epsilon_n$ satisfy (9) and let $\omega(\Lambda)$ 
be as defined in (S27). 

(a) Suppose that $n_0 = n$. There exists an absolute constant $0 < C < \infty$ such that 


sup \i*A-io{r]‘^)\>r 
0 : 5+<00 


a: 


p 
^ C exp 


n 




L C f(7^ + l)(a2 + l)(Ai + l)2 


mm 


in r, l| 


for all $r \ge 0$. 

(b) Suppose that $n_0 < n$. There exists an absolute constant $0 < C < \infty$ such that 


sup - io{v‘^)\ > r 

0 !£ j 72 <oo 


< C exp 


n 


X 

atu{kf 


C 72(72 + l)(a2 + 1) 


( -i . (2 1 ) 

(^1 - j mm |r , r, 1 | 


for all $r \ge 0$. 

Proof. To prove this lemma, we use Lemma S3. First notice that 

4(7^) - 4 (h^) = ^ hgialir]"^)} - ^ \og{al{p^)}. 
Next, assume that $n = n_0$. Then 


+ l)^o^(7^) = 

It follows that 


+ 1) V ^oAi + 1 ^ p^Xi + 1 


n 




2^ 2- 




_£ V 

n 




CTn 


+ 1 n \i + 1 Ai + 1 


sup %{p^) - > r 

0^rp<a3 

= f sup |log{cT^( 72 )} - log{al{p^)}\ > 2r 

I 0^ri'^<co 

= ] sup log{a^{p'^)} - \og{al{p^)} > 2r 

I O^Tj'^KOO 

u ] sup log{cT 2 ( 72 )} _ log{(T^( 72 )} > 2 r 
I 0^ri^<OD 

sup 




C \ sup (7 + 1 ) 1 ^ 0(7 ) - CT ,(7 )| > 

Os:772<oo Ai + i 


o^;'^<oo ^^2(7^) 

4crnr 
sup 

Os:) 72 <oo -^ 1^1 + J-J 

sup iv^+ l)\cTl{r]‘^)-(Tl{r]‘^)\> 

0s:r?2<oo Ai + i 

, 2 / 2 \ _ 2 / 2 \ 


(Tn 


u sup {r] +l)\cTo{r] )-a^{7] )\> 

0^ri'^<OD + ij 


Thus, by Lemma S3 (a)-(b), there is a constant $0 < C < \infty$ such that 

X i ^ (7 exp 


sup \(.*{rj)-fQ{ri)\>r 

0 !£r; 2 <oo 


n . 1 alu{Kf 2 


mm 


C i74(Ai + l)2 ’7 '(Ai + 1)' 


+ (7 exp 


n , [ (ToCi;(A)^ aQu{K) 


mm 


C i74(Ai + l)2’72(Ai + l) 


whenever ^ + l)^a;(A)“^n“^. Part (a) of the lemma follows. 

To prove part (b) of the lemma, assume that no < n. Then cr^iTf) ^ (To(7o-^no + l)(l~''i'o/''^)- 
Similar to the proof of part (a), it follows that 

I sup sup \al{rl^) - al{rf)\ > Aal (l - r\ 

I O^T 72 <oo J I 0 !£» 7 ^<oo n / J 

u| sup 7V)-a27)|>|(i-!^)|, 

|^ 0 !£» 72 <oo Z \ n / J 

Thus, by Lemma S3 (a), there is a constant $0 < C < \infty$ such that 


P^ sup \i^{yf)-io{rf)\> r 

0^ri^<ao 


X 


< (7 exp 
+ (7 exp 


n , [ aluj{Kf ( 

— mm f- - - 1 , , , 

6 7 ^ V n / 72 


^\2 2 ugu>(A) A _ ^ 


n 

— —mm 

(_y 


ry " „ ' I 1 
7 

A no\2 aluj{X) / npX 

7 ^ \ n J ’ V n / 


n 


whenever ^ ^u’(A) ^n(n — hq ) This implies part (b). 


□ 
Lemma S8. Let $Q = (q_{ij})$ be a $d \times d$ positive semidefinite matrix and let $\zeta = (\zeta_1, \ldots, \zeta_d)^\top \in \mathbb{R}^d$ 
be a random vector with independent components that have mean zero and variance 1. 
Let $\mu_4 = (\mu_1^{(4)}, \ldots, \mu_d^{(4)})^\top$, where $\mu_j^{(4)} = E(\zeta_j^4)$, and assume that $\mu_j^{(4)} < \infty$. Finally, define $q_t = (q_{11}^t, \ldots, q_{dd}^t)^\top$. 
Then 

Var(C'QC) = M 4 q 2 - 3||qf + 2tr(Q^) ^ 2 + d\\Q\ 


(S45) 


Proof. The inequality in (S45) is obvious. To prove the equality, we have 

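\begin{align*}
\mathrm{Var}(\zeta^\top Q \zeta)
&= E\{(\zeta^\top Q \zeta)^2\} - \{E(\zeta^\top Q \zeta)\}^2 \\
&= E\Bigl( \sum_{i,j,k,l=1}^{d} q_{ij} q_{kl} \zeta_i \zeta_j \zeta_k \zeta_l \Bigr) - \mathrm{tr}(Q)^2 \\
&= E\Bigl( \sum_{i=1}^{d} q_{ii}^2 \zeta_i^4 \Bigr) + E\Bigl\{ \sum_{i \ne j} (2 q_{ij}^2 + q_{ii} q_{jj}) \zeta_i^2 \zeta_j^2 \Bigr\} - \mathrm{tr}(Q)^2 \\
&= \mu_4^\top q_2 + 2 \sum_{i \ne j} q_{ij}^2 + \sum_{i \ne j} q_{ii} q_{jj} - \mathrm{tr}(Q)^2 \\
&= \mu_4^\top q_2 - 3 \|q_1\|^2 + 2\,\mathrm{tr}(Q^2),
\end{align*}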


as was to be shown. 


□ 
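The equality in (S45) can be verified exactly on a small example. The sketch below is illustrative only and rests on assumptions that are not in the original: it reads the identity as $\mathrm{Var}(\zeta^\top Q \zeta) = \sum_i \mu_i^{(4)} q_{ii}^2 - 3 \sum_i q_{ii}^2 + 2\,\mathrm{tr}(Q^2)$ with $\mu_i^{(4)} = E(\zeta_i^4)$, and it uses a hypothetical symmetric three-point distribution for each coordinate so that the variance can be computed by full enumeration rather than simulation. The function names and the matrix $Q$ are made up.

```python
import itertools

import numpy as np


def exact_quadratic_form_variance(Q, supports, probs):
    """Exact Var(z^T Q z) for independent discrete coordinates, by enumeration."""
    first = 0.0
    second = 0.0
    index_ranges = (range(len(s)) for s in supports)
    for idx in itertools.product(*index_ranges):
        z = np.array([supports[i][k] for i, k in enumerate(idx)])
        weight = float(np.prod([probs[i][k] for i, k in enumerate(idx)]))
        value = float(z @ Q @ z)
        first += weight * value
        second += weight * value**2
    return second - first**2


def three_point(p):
    """Symmetric law on {-a, 0, a}: mean 0, variance 1, fourth moment 1 / p."""
    a = 1.0 / np.sqrt(p)
    return np.array([-a, 0.0, a]), np.array([p / 2, 1.0 - p, p / 2])


rng = np.random.default_rng(1)
B = rng.normal(size=(3, 3))
Q = B @ B.T                      # a hypothetical positive semidefinite matrix

p_values = [1.0, 0.5, 0.2]       # fourth moments 1, 2 and 5 respectively
laws = [three_point(p) for p in p_values]
supports = [s for s, _ in laws]
probs = [w for _, w in laws]
mu4 = np.array([1.0 / p for p in p_values])

q_diag = np.diag(Q)
formula = mu4 @ q_diag**2 - 3.0 * np.sum(q_diag**2) + 2.0 * np.trace(Q @ Q)
exact = exact_quadratic_form_variance(Q, supports, probs)
print(exact, formula)
```

The two printed values agree up to floating-point rounding. The coordinates deliberately have different fourth moments, so the check exercises the coordinate-wise weighting of the diagonal entries and not only the Gaussian-like case.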





